Abstract
The Adaptive Stochastic Variance Reduced Gradient method (AdaSVRG) has two shortcomings: its initial learning rate must be screened manually, which is time-consuming, and although it incorporates mini-batch stochastic gradient descent, it ignores the differences between sample attributes. To address these problems, this paper proposes DeltaSVRG-M, an improved version of AdaSVRG. The algorithm replaces the learning-rate selection with a dynamic term and adds a classical momentum term and a batch normalization module. Experiments on the MNIST and CIFAR-10 datasets use DeltaSVRG-M and AdaSVRG to automatically optimize an image classification network. The results show that DeltaSVRG-M reaches average accuracies of 97.5% and 84.8%, respectively, which is 2.2% and 9.6% higher than AdaSVRG under the same conditions. In addition, DeltaSVRG-M converges faster and more stably than AdaSVRG, indicating that DeltaSVRG-M improves the performance of image classification algorithms and is better suited to image classification network architectures.
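To make the algorithmic ingredients named above concrete, the following is a minimal sketch of an SVRG-style update combined with a classical momentum term, in the spirit of the DeltaSVRG-M description. The paper's exact "dynamic term" that replaces the learning rate is not reproduced here, so a fixed step size `eta` stands in as a placeholder; all variable names and hyperparameter values are illustrative, not the authors' implementation.

```python
import numpy as np

# Toy least-squares problem: minimize (1/2n) * ||A w - b||^2.
rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.01 * rng.normal(size=n)

def grad_i(w, i):
    """Gradient of the i-th term 0.5 * (a_i . w - b_i)^2."""
    return (A[i] @ w - b[i]) * A[i]

def full_grad(w):
    """Full-batch gradient, computed once per outer epoch."""
    return A.T @ (A @ w - b) / n

w = np.zeros(d)
v = np.zeros(d)           # momentum buffer
eta, beta = 0.01, 0.9     # assumed step size and momentum coefficient

for epoch in range(30):
    w_snap = w.copy()         # snapshot weights for variance reduction
    mu = full_grad(w_snap)    # full gradient at the snapshot
    for _ in range(n):
        i = rng.integers(n)
        # Variance-reduced stochastic gradient (SVRG correction).
        g = grad_i(w, i) - grad_i(w_snap, i) + mu
        v = beta * v + g      # classical momentum accumulation
        w = w - eta * v

print(np.linalg.norm(full_grad(w)))  # gradient norm shrinks toward zero
```

The SVRG correction `grad_i(w, i) - grad_i(w_snap, i) + mu` keeps the stochastic gradient unbiased while shrinking its variance as the iterate approaches the snapshot, which is what permits larger, more stable steps than plain SGD.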
Authors
ZHENG Ping
XING Chunyu
DING Song
WANG Changsheng
School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, Anhui 232001, China
Source
《信息与电脑》
2022, No. 8, pp. 79-83 (5 pages)
Information & Computer