Fast Fourier transform convolutional neural network accelerator based on overlap addition


Abstract: In convolutional neural networks (CNNs), traditional convolutional layers require an enormous number of floating-point operations, and this compute-intensive workload limits the execution speed of the network, making it difficult to meet the real-time response requirements of complex applications. To reduce the floating-point computation of convolution, this work relies on the principle that convolution in the time domain is equivalent to pointwise multiplication in the frequency domain. The input feature map and the convolution kernel are transformed to the frequency domain by the fast Fourier transform (FFT) and multiplied point by point; the product is then transformed back to the time domain to obtain the convolution output. In common CNNs, the input feature map is much larger than the convolution kernel, which results in many invalid operations. The overlap addition method is adopted to reduce these invalid calculations and further speed up network execution. This work designs a hardware accelerator for frequency-domain convolution and verifies its efficiency on the Xilinx Zynq UltraScale+ MPSoC ZCU102 board. For visual geometry group 16 (VGG16) on the ImageNet dataset, the frequency-domain convolution hardware accelerator achieves a computation-time speedup of 8.5 times over traditional time-domain convolution.
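The idea summarized in the abstract can be illustrated in software. The following is a minimal NumPy sketch, not the authors' hardware design: the function names, the single-channel 2-D setting, and the tile size of 8 are all assumptions made for illustration. It shows a full linear convolution computed three ways: directly in the spatial domain, with one large zero-padded FFT, and with overlap addition, where the input is cut into small tiles, each tile is convolved with a small FFT, and the overlapping borders of the tile outputs are summed.

```python
import numpy as np


def direct_conv2d(x, k):
    """Reference full linear 2-D convolution computed in the spatial domain."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H + kh - 1, W + kw - 1))
    for i in range(kh):
        for j in range(kw):
            # Each kernel tap adds a shifted, scaled copy of the input.
            out[i:i + H, j:j + W] += k[i, j] * x
    return out


def fft_conv2d(x, k):
    """Full linear convolution via one zero-padded FFT and pointwise multiplication."""
    shape = (x.shape[0] + k.shape[0] - 1, x.shape[1] + k.shape[1] - 1)
    X = np.fft.rfft2(x, s=shape)  # kernel and input are both padded to `shape`
    K = np.fft.rfft2(k, s=shape)
    return np.fft.irfft2(X * K, s=shape)


def overlap_add_conv2d(x, k, tile=8):
    """Overlap-add: convolve small tiles with small FFTs, then sum overlapping edges.

    `tile` is an illustrative (assumed) tile edge length, not a value from the paper.
    """
    H, W = x.shape
    kh, kw = k.shape
    fft_shape = (tile + kh - 1, tile + kw - 1)
    K = np.fft.rfft2(k, s=fft_shape)  # kernel spectrum computed once, reused per tile
    out = np.zeros((H + kh - 1, W + kw - 1))
    for r in range(0, H, tile):
        for c in range(0, W, tile):
            block = x[r:r + tile, c:c + tile]  # tiles on the border may be smaller
            y = np.fft.irfft2(np.fft.rfft2(block, s=fft_shape) * K, s=fft_shape)
            bh = block.shape[0] + kh - 1
            bw = block.shape[1] + kw - 1
            # Neighbouring tile outputs overlap by (kh - 1, kw - 1); add them up.
            out[r:r + bh, c:c + bw] += y[:bh, :bw]
    return out
```

With a small kernel, the single large FFT pads the kernel up to the full feature-map size, whereas the tiled version only pads it to `tile + kernel - 1`, which is one way to read the "invalid operations" the abstract refers to. Note that CNN layers usually compute cross-correlation, so a real layer would flip the kernel (or conjugate its spectrum) before applying this scheme.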

Authors: You Chen, Li Dejian, Feng Xi, Shen Chongfei, Wei Jizeng, Liu Yu

Affiliations: School of Microelectronics; Beijing Smart-Chip Microelectronics Technology Company Limited; College of Intelligence and Computing

Source: The Journal of China Universities of Posts and Telecommunications (中国邮电高校学报（英文版）), 2024, Issue 5, pp. 71-84 (14 pages). Indexed in EI and CSCD.

Funding: Supported by the Project of the State Grid Corporation of China in 2022 (5700-201941501A-0-0-00) and the National Natural Science Foundation of China (U21B2031).

Keywords: convolutional neural network (CNN); fast Fourier transform (FFT); overlap addition

Classification code: O17 [Science: Fundamental Mathematics]

