Abstract
In recent years, binary neural networks (BNNs) have attracted attention for their small memory footprint and high computational efficiency. However, a significant performance gap remains between BNNs and floating-point deep neural networks (DNNs), caused in part by the imbalanced distribution of the positive and negative parts of quantized activation features, which hinders deployment on resource-constrained platforms. The limited accuracy of binary networks stems mainly from the information loss caused by feature discretization and the disappearance of semantic information caused by improper distribution optimization. To address these problems, this paper guides binarization with feature distribution adjustment: adjusting the mean and variance of features balances the feature distribution and reduces the information loss caused by discretization. In addition, a group excitation and feature fine-tuning module optimizes the position of the quantization zero point, balancing the binarized activation distribution and preserving semantic information to the greatest extent. Experiments show that the proposed method performs well across different backbone networks and datasets; in particular, binarizing ResNet-18 on CIFAR-10 loses only 0.4% accuracy, surpassing current mainstream binary quantization algorithms.
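The abstract describes the method only at a high level, so the PyTorch-style sketch below is one plausible reading of its two ideas rather than the authors' implementation: standardizing an activation's mean and variance with learnable parameters before sign binarization, and shifting the quantization zero point with a learnable per-channel threshold. All class, parameter, and variable names here (DistributionAdjustedBinarize, gamma, beta, zero_point) are hypothetical.

```python
import torch
import torch.nn as nn


class DistributionAdjustedBinarize(nn.Module):
    """Illustrative sketch (not the paper's code): adjust the mean and
    variance of activations before binarization, and learn a per-channel
    zero-point shift so the +1/-1 halves of sign(x) stay balanced."""

    def __init__(self, channels: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        # Hypothetical learnable scale/shift for mean-variance adjustment.
        self.gamma = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1))
        # Hypothetical learnable zero point (binarization threshold).
        self.zero_point = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Standardize each channel, then re-scale and shift: this balances
        # the feature distribution before discretization.
        mu = x.mean(dim=(0, 2, 3), keepdim=True)
        var = x.var(dim=(0, 2, 3), keepdim=True, unbiased=False)
        x_adj = self.gamma * (x - mu) / torch.sqrt(var + self.eps) + self.beta

        # Binarize around the learned zero point; the straight-through
        # estimator keeps the backward pass differentiable.
        shifted = x_adj - self.zero_point
        binary = torch.sign(shifted)
        return (binary - shifted).detach() + shifted
```

Under this reading, the module would sit in front of each binary convolution, e.g. `DistributionAdjustedBinarize(64)(features)` for a 64-channel feature map; the paper's group excitation step, which is not detailed in the abstract, is not reproduced here.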
Authors
LIU Chang; CHEN Ying (Key Laboratory of Advanced Process Control for Light Industry of Ministry of Education, Jiangnan University, Wuxi 214122, China)
Source
Control and Decision (《控制与决策》)
EI
CSCD
Peking University Core Journal (北大核心)
2024, No. 6, pp. 1840-1848 (9 pages)
Funding
National Natural Science Foundation of China (62173160).
Keywords
feature distribution
mean and variance adjustment
semantic information preservation
model compression
binary neural networks
neural network quantization