Throughput-oriented Automatic Design of FPGA Accelerator for Convolutional Neural Networks
Abstract Suboptimal resource allocation and frequency settings limit the throughput of FPGA accelerators for convolutional neural networks (CNNs). This paper proposes a throughput-oriented automatic design method to address the problem. First, accelerator design is divided into parallel-strategy design and frequency design, and an overall design flow is proposed. Then, design space exploration is modeled as a line segment partition problem and solved with a genetic algorithm and a greedy algorithm. Finally, the accelerator structure is designed according to the resulting parallel strategy, and placement and routing are optimized according to the resulting expected operating frequency so that the actual frequency meets the expectation. Design experiments with the AlexNet and VGG-16 models on the Altera DE5a-Net target board show that the method effectively improves resource utilization efficiency and yields reasonable frequency settings. Compared with other FPGA accelerator design methods for CNNs, it improves the throughput of AlexNet and VGG-16 by 82.95% and 66.19%, respectively.
Authors Lu Weina; Hu Yu; Ye Jing; Li Xiaowei (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100190)
Source Journal of Computer-Aided Design & Computer Graphics (indexed in EI, CSCD, and the Peking University Core Journals list), 2018, No. 11, pp. 2164-2173 (10 pages)
Funding National Natural Science Foundation of China (61274030, 61521092, 61532017, 61376043, 61704174)
Keywords FPGA; convolutional neural network; accelerator throughput; automatic parallel design
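
The abstract does not give the details of the segment partition formulation, so the following Python sketch is only an illustration under assumed definitions, not the authors' algorithm. It assumes a layer-pipelined accelerator in which a fixed budget of parallel compute units (the "segment") is cut into per-layer shares, and in which throughput is the clock frequency divided by the cycle count of the slowest layer. The names `greedy_partition`, `bottleneck_cycles`, and `throughput`, as well as the workload numbers, are hypothetical. Only a greedy step is shown; the genetic search mentioned in the abstract is omitted.

```python
import heapq
from math import ceil


def greedy_partition(workloads, budget):
    """Split `budget` parallel units among layers (assumed model).

    Every layer starts with one unit; each remaining unit is handed
    to whichever layer currently needs the most cycles.
    """
    if budget < len(workloads):
        raise ValueError("need at least one unit per layer")
    units = [1] * len(workloads)
    # heapq is a min-heap, so cycle counts are negated to pop the slowest layer.
    heap = [(-ceil(w), i) for i, w in enumerate(workloads)]
    heapq.heapify(heap)
    for _ in range(budget - len(workloads)):
        _, i = heapq.heappop(heap)
        units[i] += 1
        heapq.heappush(heap, (-ceil(workloads[i] / units[i]), i))
    return units


def bottleneck_cycles(workloads, units):
    """Cycles per input of the slowest pipeline stage."""
    return max(ceil(w / u) for w, u in zip(workloads, units))


def throughput(workloads, units, freq_hz):
    """Inputs per second, assuming the slowest stage sets the pace."""
    return freq_hz / bottleneck_cycles(workloads, units)


if __name__ == "__main__":
    layer_ops = [100, 300, 600]   # hypothetical per-layer operation counts
    shares = greedy_partition(layer_ops, budget=10)
    print(shares, bottleneck_cycles(layer_ops, shares))
    print(throughput(layer_ops, shares, freq_hz=200e6))
```

In this simplified model, raising the frequency and balancing the per-layer cycle counts both increase throughput, which mirrors the abstract's split of the design into a frequency part and a parallel-strategy part.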