Abstract
To improve the computational and energy efficiency of convolutional neural networks (CNNs), this paper proposes a convolution accelerator that takes 8-bit fixed-point data as input. The accelerator supports the computation types common in CNNs, including activation, batch normalization (BN), and pooling. The loop computation order is optimized and combined with a data reuse strategy to improve the efficiency of convolution computation. Following a software-hardware co-design approach, an SoC comprising a RISC-V processor and the convolution accelerator is built. Because the RISC-V processor is based on an open-source instruction set standard, its instructions can be extended to meet specific design requirements. The SoC is deployed on a Xilinx ZCU102 development board, with the RISC-V processor and the convolution accelerator running at 100 MHz and 300 MHz, respectively. Test results show that the accelerator reaches a computing throughput of 153.6 GOP/s and achieves a good speedup when running image inference with the VGG16 network.
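As a rough sanity check on the reported figures, the throughput and accelerator clock can be converted into operations per cycle. The sketch below is illustrative only: the function names are hypothetical, and counting one multiply-accumulate (MAC) as two operations is a common convention assumed here, not something the abstract states. The abstract also does not say whether 153.6 GOP/s is a peak or a measured value, nor how large the compute array is.

```python
def ops_per_cycle(throughput_gops: float, clock_mhz: float) -> float:
    """Operations completed per clock cycle at the given throughput."""
    ops_per_second = throughput_gops * 1e9
    cycles_per_second = clock_mhz * 1e6
    return ops_per_second / cycles_per_second


def macs_per_cycle(throughput_gops: float, clock_mhz: float,
                   ops_per_mac: int = 2) -> float:
    """MACs per cycle, ASSUMING each MAC counts as `ops_per_mac` operations.

    The factor of 2 (one multiply plus one add) is an assumption made for
    this sketch; the abstract does not define how operations are counted.
    """
    return ops_per_cycle(throughput_gops, clock_mhz) / ops_per_mac


if __name__ == "__main__":
    # Figures taken from the abstract: 153.6 GOP/s at 300 MHz.
    print(f"ops per cycle:  {ops_per_cycle(153.6, 300.0):.1f}")
    print(f"MACs per cycle: {macs_per_cycle(153.6, 300.0):.1f}")
```

Under these assumptions the reported throughput corresponds to about 512 operations, or about 256 8-bit MACs, per 300 MHz cycle.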
Authors
张坤宁
赵烁
何虎
邓宁
杨旭
ZHANG Kunning; ZHAO Shuo; HE Hu; DENG Ning; YANG Xu (Institute of Microelectronics, Tsinghua University, Beijing 100084, China; School of Software, Beijing Institute of Technology, Beijing 100081, China)
Source
《计算机工程》
CAS
CSCD
PKU Core Journal
2021, No. 4, pp. 153-157 (5 pages)
Computer Engineering
Funding
National Natural Science Foundation of China (91846303).