Abstract
Convolutional neural network algorithms are growing increasingly complex. Software implementations on general-purpose processors struggle to meet the real-time requirements of practical applications, while GPU-based implementations consume too much energy to be used in embedded systems. This paper proposes an FPGA-based convolutional neural network accelerator implemented with high-level synthesis (HLS). The design uses the SDSoC development environment, which delivers the required performance while greatly reducing development time. Experimental results show that, for a 64 × 64 × 3 input image, the proposed hardware/software co-design achieves a recognition time of 1.86 ms, compared with 266 ms for the CPU implementation. This is a speedup of about 143, and power consumption is reduced by a factor of 88.
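To make the HLS approach concrete, the following is a minimal sketch of the kind of C++ convolution loop nest that a high-level synthesis tool can turn into FPGA logic. It is not the authors' code. Only the 64 × 64 × 3 input size comes from the abstract. The kernel size `K`, the filter count `F`, the ReLU activation, the function name `conv_layer`, and the placement of the `PIPELINE` pragma are all illustrative assumptions.

```cpp
// Input size taken from the abstract: 64 x 64 pixels, 3 channels.
constexpr int H = 64, W = 64, C = 3;
// Hypothetical layer parameters (assumptions, not from the paper).
constexpr int K = 3;           // kernel height and width
constexpr int F = 4;           // number of output feature maps
constexpr int OH = H - K + 1;  // output height (no padding, stride 1)
constexpr int OW = W - K + 1;  // output width  (no padding, stride 1)

// Direct convolution followed by ReLU (the activation is an assumption).
// A standard C++ compiler ignores the HLS pragma, so the same source
// can be tested in software before synthesis.
void conv_layer(const float in[C][H][W],
                const float weights[F][C][K][K],
                const float bias[F],
                float out[F][OH][OW]) {
  for (int f = 0; f < F; ++f) {
    for (int y = 0; y < OH; ++y) {
      for (int x = 0; x < OW; ++x) {
#pragma HLS PIPELINE II=1
        // Accumulate one output pixel over all channels and kernel taps.
        float acc = bias[f];
        for (int c = 0; c < C; ++c)
          for (int ky = 0; ky < K; ++ky)
            for (int kx = 0; kx < K; ++kx)
              acc += in[c][y + ky][x + kx] * weights[f][c][ky][kx];
        out[f][y][x] = acc > 0.0f ? acc : 0.0f;  // ReLU
      }
    }
  }
}
```

Writing the layer as plain C++ with fixed loop bounds is what allows a tool flow such as SDSoC to move the function into programmable logic while the rest of the application stays on the processor. The actual partitioning and optimization directives used in the paper are not stated in the abstract.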
Authors
Qin Donghui (秦东辉)
Zhou Hui (周辉)
Zhao Xiongbo (赵雄波)
Liu Zhu (柳柱)
Beijing Aerospace Automatic Control Institute, Beijing 100854, China; National Aerospace Intelligence Control Technology Laboratory, Beijing 100854, China
Source
Aerospace Control (《航天控制》)
CSCD
Peking University Core Journal list (北大核心)
2019, Issue 1, pp. 21-26 (6 pages)
Keywords
Convolutional neural network
Field programmable gate array (FPGA)
Hardware acceleration
SDSoC