Abstract
Existing software implementations of convolutional neural networks (CNNs) struggle to meet the requirements for computing performance and power consumption. To address this, this paper proposes a CNN accelerator based on a field-programmable gate array (FPGA). The convolution computation unit is parallelized at the coarse-grained level, and the complete computation of a single layer is pipelined, so that 20 multiply-accumulate operations complete in a single clock cycle, which improves computational efficiency. Experimental results on MNIST handwritten digit recognition show that, at an operating frequency of 75 MHz, the accelerator reaches a peak FPGA throughput of 0.676 GMAC/s, a 4x speedup over a general-purpose CPU platform, while consuming only 2.68% of that platform's power.
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core Journals (北大核心)
2017, Issue 1, pp. 109-114, 119 (7 pages in total)
Funding
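National High Technology Research and Development Program of China (863 Program) project "Design, Development and Manufacturing of CMC Series Chips" (2012AA041701)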