High performance reconfigurable accelerator for deep convolutional neural networks (cited by: 6)
Abstract The channel sizes and kernel sizes of convolutional layers in deep convolutional neural networks vary widely, and existing accelerators struggle to compute efficiently across this diversity. To address this, a deep convolutional neural network accelerator based on the biological brain neuron mechanism is proposed. The accelerator supports multiple clustering schemes and link organizations for its brain-like neuron circuits, which accommodate different channel sizes, and provides three convolution mapping methods, which accommodate different kernel sizes. It also reuses data in local memory efficiently, which greatly reduces data movement and improves computing performance. Tested on an object classification network and an object detection network, the accelerator achieves 498.6 GOPS and 571.3 GOPS, respectively, with energy efficiencies of 582.0 GOPS/W and 651.7 GOPS/W, respectively.
Authors QIAO Ruixiu (乔瑞秀); CHEN Gang (陈刚); GONG Guoliang (龚国良); LU Huaxiang (鲁华祥) (Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China; University of the Chinese Academy of Sciences, Beijing 100049, China; Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China; Semiconductor Neural Network Intelligent Perception and Computing Technology Beijing Key Lab, Beijing 100083, China)
Source Journal of Xidian University (《西安电子科技大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2019, Issue 3, pp. 130-139 (10 pages)
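Funding Strategic Priority Research Program of the Chinese Academy of Sciences (Category A), superconducting computer research and development (XDA18000000); Beijing Municipal Science and Technology Program (Z181100001518006); National Natural Science Foundation of China, Youth Fund (61701473, 61401423); Chinese Academy of Sciences STS Program (KFJ-STS-ZDTP-070); Chinese Academy of Sciences National Defense Science and Technology Innovation Fund (CXJJ-17-M152)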
Keywords deep neural networks; accelerator; reconfigurable architecture; high performance; very large scale integrated circuit
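
The throughput and energy-efficiency figures in the abstract together imply an average power draw, since power equals throughput divided by energy efficiency. The short Python sketch below works that arithmetic through. The power values it prints are derived here, not reported in the record, and the names `reported` and `implied_power_watts` are illustrative only.

```python
# Figures as reported in the abstract (GOPS and GOPS/W).
reported = {
    "object classification": {"gops": 498.6, "gops_per_watt": 582.0},
    "object detection": {"gops": 571.3, "gops_per_watt": 651.7},
}


def implied_power_watts(gops, gops_per_watt):
    """Average power implied by throughput divided by energy efficiency.

    This is a derived quantity (an assumption of this sketch), not a
    measurement taken from the paper.
    """
    return gops / gops_per_watt


implied = {name: implied_power_watts(**r) for name, r in reported.items()}

for name, watts in implied.items():
    print(f"{name}: about {watts:.3f} W")
```

Under this reading, both workloads correspond to a little under 1 W (roughly 0.86 W and 0.88 W), assuming the throughput and efficiency figures were measured under the same conditions.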