Abstract
Convolutional neural networks consume a large amount of memory during prediction, which makes them difficult to deploy on memory-limited devices. To address this problem, this paper proposes a novel convolution calculation method for memory-limited devices. The method convolves part of the data in the input matrix and stores the result in temporary memory. It then copies the result from temporary memory into the part of the input matrix's memory that is no longer used, and repeats these steps until the whole input matrix has been convolved. The method is evaluated on single convolution calculations and on LeNet. The experimental results show that the method is faster than direct convolution. Compared with im2col, MEC and direct convolution, it reduces the average memory usage of a single convolution calculation by 89.29%, 82.60% and 57.15%, respectively, and the memory usage of LeNet by 89.90%, 82.21% and 28.07%, respectively. The method thus effectively reduces the memory usage of convolutional neural networks and facilitates their deployment on memory-limited devices.
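The steps described in the abstract (convolve part of the input, hold the result in temporary memory, then copy it into input memory that is no longer needed) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes a single-channel, stride-1, unpadded ("valid") convolution in the CNN sense (cross-correlation), processes one output row at a time, and uses a hypothetical function name `conv2d_in_place`. The granularity of the partial computation and the buffer layout are assumptions made for illustration only.

```python
def conv2d_in_place(buf, height, width, kernel):
    """Valid, stride-1 2D cross-correlation that overwrites `buf` with its output.

    buf    : flat, row-major list of height*width input values (modified in place)
    kernel : list of kh rows, each with kw weights
    Returns (out_h, out_w); the result occupies buf[:out_h * out_w], row-major.
    NOTE: illustrative sketch only; the row-wise scheme is an assumption.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = height - kh + 1, width - kw + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("kernel is larger than the input")

    tmp = [0.0] * out_w  # temporary memory: holds exactly one output row

    for r in range(out_h):
        # Step 1: convolve only the input rows r .. r+kh-1 into tmp.
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                row_start = (r + i) * width + c
                for j in range(kw):
                    acc += buf[row_start + j] * kernel[i][j]
            tmp[c] = acc
        # Step 2: later output rows read only input rows >= r+1, so the slots
        # buf[0 : (r+1)*width] are dead. Output row r ends at (r+1)*out_w,
        # which is <= (r+1)*width, so this copy never clobbers live input.
        buf[r * out_w:(r + 1) * out_w] = tmp

    return out_h, out_w


if __name__ == "__main__":
    data = [1, 2, 3,
            4, 5, 6,
            7, 8, 9]
    shape = conv2d_in_place(data, 3, 3, [[1, 0], [0, 1]])
    print(shape, data[:4])  # (2, 2) [6.0, 8.0, 12.0, 14.0]
```

In this sketch the only extra memory beyond the input buffer is one output row (`out_w` values), whereas an im2col-style approach materializes an additional matrix of `out_h * out_w * kh * kw` values. The memory reductions reported in the abstract refer to the authors' own method and measurements, not to this sketch.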
Authors
SUN Yanfei; WANG Ziniu; SUN Ying; QI Jin; DONG Zhenjiang (School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; Jiangsu HPC and Intelligent Processing Engineer Research Center, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
Source
Journal of Nanjing University of Posts and Telecommunications: Natural Science Edition
Peking University Core Journals (北大核心)
2022, No. 5, pp. 54-61 (8 pages)
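Funding
National Natural Science Foundation of China (62172235)
China Postdoctoral Science Foundation (2019M651923)
Natural Science Foundation of Jiangsu Province (BK20191381)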
Keywords
deep learning
convolution calculation
memory optimization
data reuse
edge device