
Design of High Performance Convolutional Neural Network Accelerator for Embedded FPGA
Cited by: 10
Abstract: Convolutional neural network accelerators on embedded field-programmable gate array (FPGA) platforms are limited in processing speed by scarce on-chip resources. This paper proposes a high-performance convolutional neural network accelerator to address that limit. First, a software-hardware cooperative architecture is designed according to the characteristics of convolutional neural networks and embedded FPGA platforms. Then, under the constraints of storage and computing resources, two optimization strategies are proposed: a two-dimensional direct memory access (2D DMA) blocking strategy, and a strategy for balancing the use of digital signal processing (DSP) units and look-up tables (LUTs). Finally, for the face detection application, the SSD network model is optimized and a software-hardware pipeline structure is adopted to improve the overall performance of the face detection system. The accelerator is implemented on a Xilinx ZC706 board. Experimental results show that it achieves an average performance of 167.5 GOPS and a face detection rate of 81.2 frames per second, and that its average performance and face detection rate are 1.58 times those of the embedded GPU platform TX2.
Authors: Zeng Chenglong; Liu Qiang (Tianjin Key Laboratory of Imaging and Sensing Microelectronic Technology, Tianjin 300072; School of Microelectronics, Tianjin University, Tianjin 300072)
Source: Journal of Computer-Aided Design & Computer Graphics (EI, CSCD, Peking University Core Journal), 2019, No. 9, pp. 1645-1652 (8 pages)
Funding: National Natural Science Foundation of China (61574099)
Keywords: convolutional neural network; hardware acceleration; DMA (direct memory access); face detection; FPGA (field-programmable gate array)
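
The abstract names a 2D DMA blocking strategy but gives no details, so the following is only a minimal sketch of the general idea, not the authors' design. It assumes a row-major feature map and splits it into tiles, each described by the four values a typical 2D DMA transfer needs (start offset, row count, contiguous row length, and stride). The names `Tile2D`, `tile_descriptors`, and `gather` are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence


@dataclass(frozen=True)
class Tile2D:
    """Hypothetical descriptor for one strided (2D) transfer."""
    offset: int   # index of the tile's first element in the source buffer
    rows: int     # number of rows in the tile
    row_len: int  # contiguous elements per row
    stride: int   # elements between the starts of consecutive rows


def tile_descriptors(height: int, width: int,
                     tile_h: int, tile_w: int) -> Iterator[Tile2D]:
    """Yield descriptors that cover a row-major height x width map.

    Edge tiles are clipped so that every element is covered exactly once.
    """
    if min(height, width, tile_h, tile_w) <= 0:
        raise ValueError("all dimensions must be positive")
    for r in range(0, height, tile_h):
        for c in range(0, width, tile_w):
            yield Tile2D(offset=r * width + c,
                         rows=min(tile_h, height - r),
                         row_len=min(tile_w, width - c),
                         stride=width)


def gather(buf: Sequence[int], tile: Tile2D) -> List[int]:
    """Software model of the transfer: copy the tile row by row."""
    out: List[int] = []
    for i in range(tile.rows):
        start = tile.offset + i * tile.stride
        out.extend(buf[start:start + tile.row_len])
    return out


if __name__ == "__main__":
    feature_map = list(range(5 * 6))  # assumed 5 x 6 map, values = indices
    for t in tile_descriptors(5, 6, tile_h=2, tile_w=4):
        print(t, gather(feature_map, t))
```

In a design of this kind the tile size would presumably be chosen so that one tile fits the available on-chip buffer; the values used here are arbitrary.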