FPGA-based Accelerator for Convolutional Neural Network
Cited by: 36
Abstract: Existing software implementations of Convolutional Neural Networks (CNNs) struggle to meet the networks' requirements for computing performance and power consumption. To address this, the paper designs a CNN accelerator based on a Field Programmable Gate Array (FPGA). The convolution computation unit is parallelized at the coarse-grained level, and the complete single-layer computation is pipelined, so that 20 multiply-accumulate (MAC) operations complete in a single clock cycle, which improves computational efficiency. Experiments on MNIST handwritten digit recognition show that, at a clock frequency of 75 MHz, the accelerator reaches a peak throughput of 0.676 GMAC/s on the FPGA and runs 4 times faster than a general-purpose CPU platform, while consuming only 2.68% of that platform's power.
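To make the "20 multiply-accumulates per clock cycle" figure concrete, the following is a minimal sketch of how one might count the MACs in a single convolution layer and derive an idealized cycle count. The clock frequency and MACs-per-cycle values come from the abstract; the layer shape (28x28 input, 5x5 kernels, 1 input map, 6 output maps) and the function names are illustrative assumptions, not details taken from the paper. The result is a lower bound that ignores memory stalls and pipeline fill.

```python
import math

MACS_PER_CYCLE = 20      # stated in the abstract
CLOCK_HZ = 75_000_000    # 75 MHz, stated in the abstract


def conv_layer_macs(in_size, kernel, in_maps, out_maps):
    """MAC count for a square convolution layer with stride 1 and no padding."""
    out_size = in_size - kernel + 1
    return out_size * out_size * kernel * kernel * in_maps * out_maps


def ideal_cycles(macs, macs_per_cycle=MACS_PER_CYCLE):
    """Lower bound on cycles if every cycle retires macs_per_cycle MACs."""
    return math.ceil(macs / macs_per_cycle)


# Hypothetical layer shape (an assumption, NOT from the paper).
macs = conv_layer_macs(in_size=28, kernel=5, in_maps=1, out_maps=6)
cycles = ideal_cycles(macs)
latency_us = cycles / CLOCK_HZ * 1e6

print(f"MACs: {macs}, ideal cycles: {cycles}, ideal latency: {latency_us:.1f} us")
```

Changing the layer parameters shows how the idealized latency scales with kernel size and the number of feature maps; real designs would fall short of this bound.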
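Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2017, No. 1, pp. 109-114, 119 (7 pages)
Funding: National "863" Program project "Design, Development and Manufacturing of CMC Series Chips" (2012AA041701)
Keywords: convolutional neural network; Field Programmable Gate Array (FPGA); accelerator; pipeline; parallelization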