Design and FPGA implementation of fast convolution algorithm based on 3D-Winograd (cited by: 1)

Abstract: In recent years, convolutional neural networks (CNNs) have been widely adopted for computer vision tasks. Owing to their high performance, energy efficiency, and reconfigurability, FPGAs are regarded as the most promising hardware accelerators for CNNs. However, FPGA solutions that compute 3D convolution with the traditional Winograd algorithm are constrained by FPGA computing power and storage resources, which leaves room for performance improvement. This paper first studies the one-dimensional expansion of the Winograd algorithm for three-dimensional operations. It then improves the performance of CNNs on FPGA by increasing the dimensions of the input feature map and convolution block processed at one time, and by quantizing the weights and input data to low bit widths. The optimization comprises four parts: replacing some divisions with shifts, a tiling scheme, the extension from two to three dimensions, and low-bit quantization. Compared with the traditional two-dimensional Winograd algorithm, the optimized algorithm reduces the number of clock cycles per convolutional layer by a factor of about 7; compared with the traditional sliding-window convolution algorithm, the average reduction per convolutional layer is also about a factor of 7. The results show that the 3D-Winograd algorithm based on one-dimensional expansion can greatly reduce computational complexity and improve the performance of CNNs running on FPGA.
Authors: LIN Keyu; JIANG Hongxu; ZHANG Yonghua; CONG Rongzi (Beijing Key Laboratory of Digital Media, Beihang University, Beijing 100083, China)
Source: Journal of Beijing University of Aeronautics and Astronautics (indexed in EI, CAS, CSCD, Peking University Core), 2021, No. 9, pp. 1900-1907 (8 pages)
Funding: Aerospace Science and Technology Fund (190109); National Natural Science Foundation of China (61872017).
Keywords: Convolutional Neural Network (CNN); FPGA; Winograd; convolution algorithm; fast algorithm
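
The one-dimensional expansion mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the standard Winograd F(2,3) transform matrices, a single 4x4x4 input tile, and a 3x3x3 kernel, and it uses exact rational arithmetic rather than low-bit quantized data. All function names (`mode0`, `rotate`, `transform`, `winograd_3d`, `direct_3d`) are hypothetical. The 1D transforms are applied along each of the three axes in turn, so the tile needs 4^3 = 64 element-wise multiplications, against 2^3 x 3^3 = 216 for direct sliding-window convolution (transform additions not counted).

```python
from fractions import Fraction

# Standard 1D Winograd F(2,3) matrices (assumed here; the paper's tile sizes may differ).
H = Fraction(1, 2)
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]  # input transform
G = [[1, 0, 0], [H, H, H], [H, -H, H], [0, 0, 1]]                  # filter transform
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]                                # output transform


def mode0(M, T):
    """Apply matrix M along axis 0 of the 3D nested list T."""
    a, b = len(T[0]), len(T[0][0])
    return [[[sum(row[i] * T[i][j][k] for i in range(len(T)))
              for k in range(b)] for j in range(a)] for row in M]


def rotate(T):
    """Cycle the axes (i, j, k) -> (j, k, i)."""
    a, b, c = len(T), len(T[0]), len(T[0][0])
    return [[[T[i][j][k] for i in range(a)] for k in range(c)] for j in range(b)]


def transform(M, T):
    """Apply the 1D transform M along all three axes (three rotations restore axis order)."""
    for _ in range(3):
        T = rotate(mode0(M, T))
    return T


def winograd_3d(d, g):
    """3D Winograd on one tile: d is 4x4x4, g is 3x3x3, result is 2x2x2."""
    U = transform(G, g)    # transformed filter, 4x4x4
    V = transform(BT, d)   # transformed input tile, 4x4x4
    # 64 element-wise multiplications replace the 216 of the direct method.
    P = [[[U[i][j][k] * V[i][j][k] for k in range(4)]
          for j in range(4)] for i in range(4)]
    return transform(AT, P)


def direct_3d(d, g):
    """Reference sliding-window (correlation) result for the same tile."""
    return [[[sum(d[i + a][j + b][k + c] * g[a][b][c]
                  for a in range(3) for b in range(3) for c in range(3))
              for k in range(2)] for j in range(2)] for i in range(2)]


# Small deterministic integer test data (arbitrary values chosen for illustration).
d = [[[(i * 16 + j * 4 + k) % 7 - 3 for k in range(4)]
      for j in range(4)] for i in range(4)]
g = [[[(a * 9 + b * 3 + c) % 5 - 2 for c in range(3)]
      for b in range(3)] for a in range(3)]

assert winograd_3d(d, g) == direct_3d(d, g)
```

The only non-integer constants in this sketch are the factors of 1/2 in the filter transform `G`. In fixed-point hardware such a division by two is the kind of operation that a right shift can replace, which is in the spirit of the shift-for-division optimization the abstract lists.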