Hardware design of a convolution calculation module based on a systolic array (cited by: 1)
Abstract When convolution computation for a convolutional neural network is implemented on a Field Programmable Gate Array (FPGA), high parallelism leads to data paths with long broadcasts and large fan-in/fan-out. To address this problem, this paper uses a systolic array to implement the convolution calculation module of the convolutional neural network. The weights are held stationary in each processing element, and the size of the systolic array is set according to the dimensions of the input and output feature maps. The hardware design of the convolution calculation module is then realized with Vivado high-level synthesis. The experimental results show that the design meets the timing requirements of level-1 pipelining while keeping resource usage low and scaling well.
Authors Wang Chunlin; Tan Kejun (Information Science and Technology College, Dalian Maritime University, Dalian 116026, China)
Source Application of Electronic Technique (《电子技术应用》), 2020, No. 1, pp. 57-61 (5 pages)
Keywords FPGA; systolic array; convolution computation; high-level synthesis
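
The weight-stationary idea described in the abstract can be illustrated with a small cycle-level simulation. This is only a sketch under stated assumptions, not the authors' design: it assumes a 1x1 (pointwise) convolution, one array row per input channel and one column per output channel, activations moving right and partial sums moving down. The function names `systolic_conv1x1` and `reference_conv1x1` are hypothetical, and the abstract does not give the paper's actual mapping or its HLS source.

```python
def systolic_conv1x1(x, w):
    """Simulate a weight-stationary systolic array, cycle by cycle.

    x: list of T pixels, each a list of R input-channel values.
    w: R x C weight matrix; PE (i, j) holds w[i][j] for the whole run.
    Returns a T x C list of output-channel values per pixel.
    Assumption: this models a 1x1 convolution only.
    """
    T, R, C = len(x), len(w), len(w[0])
    x_reg = [[0] * C for _ in range(R)]  # activation registers (flow right)
    p_reg = [[0] * C for _ in range(R)]  # partial-sum registers (flow down)
    y = [[0] * C for _ in range(T)]

    # The last pixel leaves the bottom-right PE at cycle T + R + C - 3.
    for cycle in range(T + R + C - 2):
        new_x = [[0] * C for _ in range(R)]
        new_p = [[0] * C for _ in range(R)]
        for i in range(R):
            for j in range(C):
                if j == 0:
                    # Row i is fed with a skew of i cycles.
                    t = cycle - i
                    x_in = x[t][i] if 0 <= t < T else 0
                else:
                    # Only a neighbour-to-neighbour link, no broadcast.
                    x_in = x_reg[i][j - 1]
                p_in = p_reg[i - 1][j] if i > 0 else 0
                new_x[i][j] = x_in
                new_p[i][j] = p_in + w[i][j] * x_in
        x_reg, p_reg = new_x, new_p

        # Column j of the bottom row finishes pixel t at cycle t + (R-1) + j.
        for j in range(C):
            t = cycle - (R - 1) - j
            if 0 <= t < T:
                y[t][j] = p_reg[R - 1][j]
    return y


def reference_conv1x1(x, w):
    """Direct (non-systolic) computation used to check the simulation."""
    return [[sum(px[i] * w[i][j] for i in range(len(w)))
             for j in range(len(w[0]))] for px in x]


if __name__ == "__main__":
    pixels = [[1, 2], [3, 4], [5, 6]]   # 3 pixels, 2 input channels
    weights = [[1, 0, 2], [0, 1, 3]]    # 2 input x 3 output channels
    print(systolic_conv1x1(pixels, weights))
```

In this sketch every processing element exchanges data only with its immediate neighbours, which is the property the abstract relies on to avoid long broadcast and high fan-in/fan-out paths. The array dimensions follow the channel counts, so adding channels adds rows or columns without changing any single element.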