Abstract
With the increasing demand of deep convolutional neural networks for computation and memory access, network compression and acceleration techniques have become a research hotspot in recent years. When compressing deep convolutional neural networks, retraining-based approaches such as network architecture modification and pseudo-quantization suffer from high computational cost, difficulty in obtaining the training data, and long deployment cycles. This paper proposes a compression and acceleration method that effectively exploits the numerical equilibrium of convolutional neural network weights and the combined information of batch normalization and the ReLU nonlinear activation; good compression and acceleration results are achieved merely by adjusting the weights of a pre-trained network model. The method is well suited to implementation on custom hardware such as FPGAs or ASICs, and can strike a balance among hardware logic resource usage, power consumption, memory bandwidth, and object detection accuracy. Finally, the effectiveness of the proposed low-bit-width quantized inference method for convolutional neural networks is verified on a face detection task.
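To make the abstract's ideas concrete, below is a minimal NumPy sketch of three standard post-training transformations consistent with what it describes: folding batch normalization into the preceding convolution's weights, equalizing weight ranges between adjacent ReLU-connected layers (one common realization of "numerical equilibrium"), and quantizing post-ReLU activations to an unsigned integer format. All function names and details are illustrative assumptions, not the paper's exact procedure.

import numpy as np

def fold_batch_norm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    # Fold an inference-time BatchNorm that follows a convolution into
    # the convolution itself: BN(Wx + b) == (s*W)x + s*(b - mean) + beta,
    # where s = gamma / sqrt(var + eps) is a per-output-channel scale.
    # weight: (out_ch, in_ch, kh, kw); bias and BN statistics: (out_ch,)
    scale = gamma / np.sqrt(var + eps)
    w_folded = weight * scale[:, None, None, None]
    b_folded = (bias - mean) * scale + beta
    return w_folded, b_folded

def equalize_pair(w1, b1, w2):
    # Cross-layer range equalization for two consecutive conv layers
    # joined by ReLU. Since ReLU(s*x) == s*ReLU(x) for s > 0, dividing
    # output channel c of layer 1 by s_c and multiplying the matching
    # input channel of layer 2 by s_c leaves the network unchanged.
    # Choosing s_c = sqrt(r1_c / r2_c) makes both ranges sqrt(r1_c*r2_c).
    r1 = np.abs(w1).reshape(w1.shape[0], -1).max(axis=1)   # per-output-channel range of layer 1
    r2 = np.abs(w2).max(axis=(0, 2, 3))                    # per-input-channel range of layer 2
    s = np.sqrt(r1 / np.maximum(r2, 1e-8))
    w1_eq = w1 / s[:, None, None, None]
    b1_eq = b1 / s
    w2_eq = w2 * s[None, :, None, None]
    return w1_eq, b1_eq, w2_eq

def quantize_unsigned(x, num_bits=8):
    # Uniform unsigned quantization. Post-ReLU activations are
    # non-negative, so the full unsigned range [0, 2^b - 1] is usable,
    # which is what makes the BN+ReLU combination attractive on
    # fixed-point FPGA/ASIC datapaths.
    qmax = 2 ** num_bits - 1
    scale = x.max() / qmax if x.max() > 0 else 1.0
    q = np.clip(np.round(x / scale), 0, qmax).astype(np.uint8)
    return q, scale

In a pipeline of this kind, BN folding and equalization would run once on the pre-trained weights, after which only the quantized integer tensors and per-tensor scales are shipped to the accelerator; no retraining or labeled data is required, matching the deployment advantage the abstract claims.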
Authors
FU Qiang; JIANG Jingfei; DOU Yong (School of Computer, National University of Defense Technology, Changsha 410073)
Source
Computer & Digital Engineering (《计算机与数字工程》), 2019, No. 11, pp. 2671-2674 (4 pages)
Funding
National Science and Technology Major Project for Core Electronic Devices, High-end Generic Chips and Basic Software (No. 2018ZX01028101)
Key Program of the National Natural Science Foundation of China (No. 61732018)
Keywords
convolutional neural network
quantization
batch normalization