Structured Compression and Acceleration of Network Based on Tiny-YOLOv3 (cited by: 2)
Abstract: To address the high resource overhead and slow running speed of the Tiny-YOLOv3 (You Only Look Once v3) network when it is deployed on embedded platforms in specific application scenarios, this paper proposes a structured compression scheme that combines pruning and quantization, and builds a convolutional-layer acceleration system for the compressed network. The compression scheme uses sparsity training and channel pruning to reduce the amount of computation in the network, and uses fixed-point quantization of activations and power-of-two quantization of weights to reduce the parameter storage of the network's convolutional layers. In the acceleration system, the programmable logic part implements a convolutional-layer accelerator core designed with parallelism and pipelining, and the processing system part is responsible for scheduling. Experimental results show that the Tiny-YOLOv3 network after structured compression achieves a mean average precision of 0.46, with a parameter compression ratio of 5%. When the acceleration system is deployed on a Xilinx ZYNQ chip, the hardware runs stably at a clock frequency of 250 MHz, and the convolution unit delivers a computing capacity of 36 GOPS. In addition, the overall power consumption of the acceleration platform is 2.6 W, and the hardware design saves hardware resources.
Authors: HU Yongyang; LI Miao; MENG Fankai; ZHANG Feng; MENG Yiwei; SONG Yukun (National ASIC Design Engineering Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Microelectronics, Hefei University of Technology, Hefei 230009, China; School of Information Engineering, Capital Normal University, Beijing 100048, China)
Source: Electronic Science and Technology (《电子科技》), 2023, No. 8, pp. 43-48, 55 (7 pages)
Funding: National Key Research and Development Program of China (2018YFB2202604).
Keywords: object detection network; Tiny-YOLOv3; neural network compression; structured pruning; quantization; hardware acceleration; pipeline; ZYNQ
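
The abstract pairs fixed-point activations with power-of-two weights. One reason this combination suits hardware is that each multiplication can be replaced by a bit shift. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the 8-bit activation width with 4 fractional bits, and the exponent range of -7 to 0 are all assumptions chosen for illustration; the abstract does not state the bit widths used.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
MIN_EXP = -7      # smallest weight exponent kept (weight magnitude 2**-7)
MAX_EXP = 0       # largest weight exponent kept (weight magnitude 2**0)
FRAC_BITS = 4     # fractional bits of the fixed-point activation
TOTAL_BITS = 8    # total signed bits of the fixed-point activation


def quantize_weight_pow2(w, min_exp=MIN_EXP, max_exp=MAX_EXP):
    """Round a weight to a signed power of two, rounding in the log2 domain.

    Returns (sign, exponent). A sign of 0 encodes a zero weight.
    """
    if w == 0.0:
        return 0, min_exp
    sign = 1 if w > 0 else -1
    exp = round(math.log2(abs(w)))
    exp = max(min_exp, min(max_exp, exp))  # clamp to the representable range
    return sign, exp


def quantize_activation(x, frac_bits=FRAC_BITS, total_bits=TOTAL_BITS):
    """Quantize an activation to a saturating signed fixed-point integer code."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, round(x * (1 << frac_bits))))


def shift_multiply(act_code, sign, exp, min_exp=MIN_EXP):
    """Multiply an activation code by a power-of-two weight using only a shift.

    The result is an integer in units of 2**(min_exp - FRAC_BITS), so the
    shift amount (exp - min_exp) is never negative and no bits are lost.
    """
    return sign * (act_code << (exp - min_exp))


def dequantize(acc, min_exp=MIN_EXP, frac_bits=FRAC_BITS):
    """Convert an accumulator value back to a real number."""
    return acc * 2.0 ** (min_exp - frac_bits)


if __name__ == "__main__":
    sign, exp = quantize_weight_pow2(0.3)   # nearest power of two is 0.25
    code = quantize_activation(1.5)         # 1.5 * 2**4 = 24
    acc = shift_multiply(code, sign, exp)
    print(sign, exp, code, acc, dequantize(acc))
```

With these assumed settings, each weight needs only a sign and a small exponent instead of a full-precision value, which is one way parameter storage can shrink. The product of weight and activation then reduces to a left shift and a sign flip.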