Abstract
A traditional convolutional neural network requires a large number of computing units and frequent data accesses, resulting in slow computation and low efficiency. This paper designs a new data block structure that fully exploits data reuse, greatly reducing the number of memory reads, and fully engages the FPGA's parallel computing resources so that multiple multiply-accumulate operations execute simultaneously, realizing an efficient parallel convolution circuit. The weight and bias parameters are separately fused and optimally quantized, reducing memory usage. With VGG16 as the test network on the ImageNet dataset, the accuracy loss is only 0.02%; at 200 MHz, the throughput reaches 129.6 GOPS with a power consumption of only 5.26 W.
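The central mechanism named in the abstract, loading each input tile once and reusing it for every kernel position and output channel while many multiply-accumulate (MAC) units run in parallel, can be illustrated in software. The NumPy sketch below is a hypothetical emulation only: the tile size, channel counts, and loop structure are assumptions for illustration, not the paper's actual circuit.

```python
# Hypothetical software emulation of the data-reuse idea: an input tile is
# fetched from "external memory" exactly once and then reused across every
# kernel position and output channel. Tile sizes and channel counts are
# illustrative assumptions, not values from the paper.
import numpy as np

K = 3                 # kernel size (VGG16 uses 3x3 kernels)
TILE = 8              # assumed output-tile edge length
C_IN, C_OUT = 4, 16   # assumed channel counts for the sketch

def conv_tile(in_tile, weights):
    """Convolve one input tile with all output-channel kernels.

    in_tile : (C_IN, TILE+K-1, TILE+K-1) -- read from memory exactly once
    weights : (C_OUT, C_IN, K, K)
    returns : (C_OUT, TILE, TILE)
    """
    out = np.zeros((C_OUT, TILE, TILE), dtype=np.int32)
    for ky in range(K):
        for kx in range(K):
            # One (TILE x TILE) window per kernel position; the same window
            # is consumed by every output channel -- this is the data reuse.
            window = in_tile[:, ky:ky + TILE, kx:kx + TILE]
            for co in range(C_OUT):
                for ci in range(C_IN):
                    out[co] += weights[co, ci, ky, kx] * window[ci]
    return out

rng = np.random.default_rng(0)
x = rng.integers(-8, 8, size=(C_IN, TILE + K - 1, TILE + K - 1), dtype=np.int32)
w = rng.integers(-8, 8, size=(C_OUT, C_IN, K, K), dtype=np.int32)
print(conv_tile(x, w).shape)   # (16, 8, 8)
```

In hardware, the two channel loops would be unrolled into parallel MAC lanes, so each element of `window` is fetched once and consumed by all lanes in the same cycle; the sequential Python loops stand in for that unrolling.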
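The abstract states only that the weight and bias parameters are separately fused and optimally quantized. The sketch below assumes one plausible reading: weights quantized to signed 8-bit fixed point with an error-minimizing per-tensor scale, and the bias re-expressed at the int32 accumulator scale so it can be added directly during the MAC loop. The bit widths, the MSE criterion, and the bias treatment are assumptions, not details taken from the paper.

```python
# A minimal sketch, assuming 8-bit symmetric weight quantization with an
# MSE-minimizing scale search, plus bias folding to the accumulator scale.
# These choices are hypothetical; the paper's exact scheme is not given here.
import numpy as np

def quantize_weights(w, bits=8):
    """Return int8 codes and the scale that minimizes reconstruction MSE."""
    qmax = 2 ** (bits - 1) - 1
    best = None
    for frac in np.linspace(0.5, 1.0, 21):   # small grid around max-abs scale
        scale = frac * np.abs(w).max() / qmax
        q = np.clip(np.round(w / scale), -qmax - 1, qmax)
        err = np.mean((q * scale - w) ** 2)
        if best is None or err < best[0]:
            best = (err, q.astype(np.int8), scale)
    return best[1], best[2]

def quantize_bias(b, w_scale, x_scale):
    """Express the bias at the int32 accumulator scale (w_scale * x_scale)."""
    return np.round(b / (w_scale * x_scale)).astype(np.int32)

rng = np.random.default_rng(1)
w = rng.normal(size=(16, 4, 3, 3)).astype(np.float32)
b = rng.normal(size=16).astype(np.float32)
wq, ws = quantize_weights(w)
bq = quantize_bias(b, ws, x_scale=0.05)      # assumed activation scale
print(wq.dtype, bq.dtype)                    # int8 int32
```

Storing int8 weights and int32 biases in place of 32-bit floats is what would account for the reduced memory footprint the abstract reports.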
Authors
SUI Yuanfeng; CHANG Liang; ZHAO Simeng; CHANG Yuchun (School of Microelectronics, Dalian University of Technology, Dalian 116100, Liaoning, China; The Thirty Second Research Institute, China Electronics Technology Group Corporation, Shanghai 201808, China)
Source
Microelectronics & Computer (《微电子学与计算机》)
2021, Issue 8, pp. 34-39 (6 pages)
Funding
National Natural Science Foundation of China (11975066, 61801450)
Fundamental Research Funds for the Central Universities (DUT20RC(3)058)
Keywords
convolutional neural network
data reuse
FPGA
parameter quantization