Abstract
Traditional quantization algorithms for convolutional neural networks widely use symmetric uniform quantization to quantize model weights. They do not take into account the relationship between the quantization of adjacent weights: the quantization noise produced by quantizing one weight can be compensated by adjusting the quantization direction of the weights that follow. To address this problem, a ternary quantization algorithm for convolutional neural networks based on the idea of weight interaction is proposed, achieving a model compression ratio of 16×. With ImageNet as the dataset, the prediction accuracy of the quantized AlexNet and ResNet-18 networks drops by less than 3%. The method achieves a high compression ratio while retaining high accuracy, and can be used to port convolutional neural networks to mobile platforms with limited computing resources.
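The weight-interaction idea in the abstract can be illustrated with a toy error-feedback scheme. The sketch below is not the authors' algorithm, whose details the abstract does not give; it only assumes that the quantization error of one weight is carried into the next weight before that weight is ternarized. The function names, the threshold `delta`, and the scale `alpha` are hypothetical choices made for illustration. The last helper shows one plausible reading of the 16× figure, assuming 32-bit floating-point weights stored as 2-bit ternary codes.

```python
import numpy as np


def ternarize_independent(w, delta, alpha):
    """Baseline: map each weight on its own to {-alpha, 0, +alpha}."""
    w = np.asarray(w, dtype=float)
    q = np.zeros_like(w)
    q[w > delta] = alpha
    q[w < -delta] = -alpha
    return q


def ternarize_with_error_feedback(w, delta, alpha):
    """Illustrative only: carry each weight's quantization error forward.

    The residual left by quantizing weight i is added to weight i+1
    before weight i+1 is thresholded, so later weights can offset the
    noise introduced by earlier ones.
    """
    w = np.asarray(w, dtype=float)
    q = np.zeros_like(w)
    carry = 0.0  # accumulated quantization error from earlier weights
    for i, wi in enumerate(w):
        target = wi + carry
        if target > delta:
            q[i] = alpha
        elif target < -delta:
            q[i] = -alpha
        else:
            q[i] = 0.0
        carry = target - q[i]
    return q


def compression_ratio(bits_full=32, bits_ternary=2):
    """Assumed storage model: 32-bit floats replaced by 2-bit ternary codes."""
    return bits_full / bits_ternary


if __name__ == "__main__":
    w = np.array([0.3, 0.3, 0.3, 0.3])
    q_ind = ternarize_independent(w, delta=0.5, alpha=1.0)
    q_fb = ternarize_with_error_feedback(w, delta=0.5, alpha=1.0)
    print("independent:", q_ind, "summed error:", abs(np.sum(w - q_ind)))
    print("feedback:   ", q_fb, "summed error:", abs(np.sum(w - q_fb)))
    print("compression ratio:", compression_ratio())
```

In this toy case every weight falls below the threshold, so independent quantization zeroes all four and loses their combined contribution, while the error-feedback variant rounds one weight up and leaves a smaller summed error. Whether the published method uses this ordering or this update rule is not stated in the abstract.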
Authors
Xiao Guolin (肖国麟)
Yang Chunling (杨春玲)
Chen Yu (陈宇)
School of Electrical Engineering and Automation, Harbin Institute of Technology, Harbin 150001, China
Source
Application of Electronic Technique (《电子技术应用》)
2020, No. 10, pp. 39-41 (3 pages)
Keywords
ternary quantization
convolutional neural network
weight interaction
model compression