
FPGA Neural Network Accelerator for Low-Latency Object Detection
Abstract: Object detection network algorithms offer high detection accuracy, but their enormous computational complexity makes it difficult for traditional processors to meet real-time requirements. To address this, an FPGA neural network accelerator for low-latency object detection is designed. The accelerator supports highly parallel sparse convolution, which increases parallelism and reduces computation latency. A centralized storage array structure is also designed, enabling non-one-to-one data interaction between the storage array and the computing array. In tests running the YOLOv3 deep neural network on a Xilinx VCU118 development board, the accelerator achieves a per-frame latency of only 24.36 ms, a throughput of 2704 GOPS, and higher area efficiency.
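The reported latency and throughput figures are mutually consistent. As a quick sanity check, the minimal sketch below divides an assumed YOLOv3 workload of about 65.86 GOPs per 416×416 frame (a commonly cited figure from the YOLOv3 paper, not stated in this abstract) by the reported 24.36 ms latency; the input resolution and the treatment of sparsity are assumptions, so this is only an illustrative estimate, not the authors' measurement method.

```python
# Sanity check: effective throughput = workload per frame / per-frame latency.
# Assumption: YOLOv3 at 416x416 input requires ~65.86 GOPs per frame
# (counting a multiply-accumulate as two operations); the abstract does not
# state the input resolution, so this is only an illustrative estimate.
ops_per_frame_gop = 65.86       # assumed dense YOLOv3 workload, in GOPs
latency_s = 24.36e-3            # per-frame latency reported in the abstract

effective_throughput_gops = ops_per_frame_gop / latency_s
print(f"effective throughput ≈ {effective_throughput_gops:.0f} GOPS")
# Prints ≈ 2704 GOPS, matching the throughput figure reported above.
```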
Authors: 郑思杰 (ZHENG Sijie); 李杰 (LI Jie); 贺光辉 (HE Guanghui). Affiliations: School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240; Shanghai Academy of Spaceflight Technology (SAST), Shanghai 201109
Source: Modern Computer (《现代计算机》), 2021, No. 18, pp. 38-43 (6 pages)
Funding: National Key Research and Development Program of China (No. 2019YFB2204500); Shanghai Aerospace Advanced Technology Joint Research Fund (No. USCAST2019-28)
Keywords: FPGA accelerator; object detection; convolutional neural network; low latency; sparse computation