
Study on Realization of Convolutional Neural Network Acceleration Based on Multi-core

Cited by: 2
Abstract: To accelerate convolutional neural networks, multiple convolution units are commonly arranged in parallel. Ideally, the more convolution units there are, the faster the processing. In practice, however, data bandwidth severely limits the processing speed of the convolution units; hardware bandwidth is a scarce resource, and increasing it is very costly. Raising the processing speed of convolutional neural networks under limited data bandwidth and hardware overhead has therefore become an urgent problem in hardware architecture design. In view of these shortcomings of the prior art, this work provides a multi-core based convolutional neural network acceleration method and system, together with a storage medium and terminal. Multiple parallel convolution cores reduce the data bandwidth the network requires, and, for the same hardware data bandwidth, parallel vector dot-product operations inside the convolution cores increase the processing speed.
Author: 张慧明 (ZHANG Huiming), VeriSilicon Microelectronics Shanghai Co., Ltd, Shanghai 201203, China
Source: Application of IC (《集成电路应用》), 2020, No. 5, pp. 10-13 (4 pages)
Funding: Shanghai High-Tech Enterprise Science and Technology Innovation Project.
Keywords: convolutional neural network; data bandwidth; machine learning; deep learning
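The bandwidth argument in the abstract can be illustrated with a small software sketch. It is not the author's hardware design, which the abstract does not detail. It only assumes that several convolution cores can consume the same input window, so each window is fetched once and then used for one vector dot product per core. The function name `conv2d_shared_input` and the fetch counter are hypothetical and exist purely for illustration.

```python
def conv2d_shared_input(image, kernels):
    """Valid 2-D convolution (cross-correlation) of one image with several kernels.

    Assumption for illustration: each input window is fetched once and reused
    by every kernel, modelling several cores that share one input stream.
    Returns (outputs, window_fetches).
    """
    k = len(kernels[0])                      # square k x k kernels assumed
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    outputs = [[[0] * out_w for _ in range(out_h)] for _ in kernels]
    window_fetches = 0
    for r in range(out_h):
        for c in range(out_w):
            # Fetch the window once and flatten it into a vector.
            window = [image[r + i][c + j] for i in range(k) for j in range(k)]
            window_fetches += 1
            # Every "core" computes its own dot product on the same window.
            for n, kernel in enumerate(kernels):
                weights = [kernel[i][j] for i in range(k) for j in range(k)]
                outputs[n][r][c] = sum(w * x for w, x in zip(weights, window))
    return outputs, window_fetches


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernels = [
    [[1, 0], [0, 1]],   # sums the main diagonal of each window
    [[1, 1], [1, 1]],   # sums all four values of each window
]
outputs, fetches = conv2d_shared_input(image, kernels)
naive_fetches = fetches * len(kernels)  # if every core fetched its own copy
```

In this toy case the shared scheme reads 4 input windows, whereas fetching separately for each of the two kernels would read 8. The saving grows with the number of cores that share the input, which is the effect the abstract attributes to parallel convolution cores.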
