Design and implementation of FPGA-based deep learning object detection system

Cited by: 10
Abstract: To address the high computational complexity and large memory requirements of current deep learning object detection algorithms, this paper designs and implements an FPGA-based deep learning object detection system. A hardware accelerator is designed for the YOLOv2-Tiny object detection algorithm, the processing latency of each accelerator module is modeled, and the design of the convolution module is described in detail. Experimental results show that the heterogeneous CPU+FPGA system achieves about 5.5x the speed and 94.6x the energy efficiency of the software Darknet implementation on an 8-core Xeon server, and 84.4x the speed (84.8x in the English abstract of the source) and 67.5x the energy efficiency of the software version on the dual-core ARM Cortex-A9 on Zynq. The design also outperforms previous work in performance.
Authors: Chen Chen (陈辰); Yan Wei (严伟); Xia Jun (夏珺); Chai Zhilei (柴志雷) (School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China; School of Software & Microelectronics, Peking University, Beijing 102600, China)
Source: Application of Electronic Technique (《电子技术应用》), 2019, No. 8, pp. 40-43, 47 (5 pages)
Keywords: deep learning; object detection; FPGA; hardware accelerator