Research on Optimization Algorithm of Adaptive Random Variance Attenuation Gradient Method (original title: 自适应随机方差衰减梯度法优化算法研究)

Cited by: 1
Abstract: The Adaptive Stochastic Variance Reduced Gradient method (AdaSVRG) has two drawbacks: its initial learning rate must be selected manually, which is time-consuming, and although it incorporates mini-batch stochastic gradient descent, it ignores the differences among the attributes of different samples. To address both problems, this paper proposes DeltaSVRG-M, an improved variant of AdaSVRG. The algorithm replaces learning-rate selection with a dynamic term and adds a traditional momentum term and a batch normalization module. In experiments on the MNIST and CIFAR-10 datasets, DeltaSVRG-M and AdaSVRG were each used to optimize image classification algorithms. The results show that the average accuracy of DeltaSVRG-M reaches 97.5% and 84.8%, respectively, which under the same conditions is 2.2% and 9.6% higher than AdaSVRG. The convergence speed and convergence stability of DeltaSVRG-M are also better than those of AdaSVRG, indicating that DeltaSVRG-M can improve the performance of image classification algorithms and is better suited to network architectures for image classification.
Authors: ZHENG Ping (郑平); XING Chunyu (邢春雨); DING Song (丁松); WANG Changsheng (王昌盛). School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, Anhui 232001, China
Source: Information & Computer (《信息与电脑》), 2022, No. 8, pp. 79-83 (5 pages)
Keywords: stochastic variance reduction; Adadelta algorithm; traditional momentum; batch normalization
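The abstract names the ingredients of DeltaSVRG-M but does not give its update equations. The following is only a minimal sketch of how those ingredients could fit together, assuming that the "dynamic term" is the standard Adadelta step (suggested by the keywords) and that the momentum term is classical heavy-ball momentum. The function names, the hyperparameters `rho`, `eps`, and `beta`, and the toy least-squares problem are all hypothetical and are not taken from the paper. Batch normalization is omitted because it is a network layer rather than part of the optimizer update.

```python
import math
import random


def make_data(n=50, seed=0):
    """Toy linear-regression data (hypothetical): y = 2*x0 - 3*x1, no noise."""
    rng = random.Random(seed)
    xs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    ys = [2.0 * x0 - 3.0 * x1 for x0, x1 in xs]
    return xs, ys


def grad_i(w, x, y):
    """Gradient of the per-sample loss 0.5 * (w.x - y)^2."""
    r = w[0] * x[0] + w[1] * x[1] - y
    return [r * x[0], r * x[1]]


def full_loss(w, xs, ys):
    """Mean of the per-sample losses over the whole data set."""
    total = sum(0.5 * (w[0] * x[0] + w[1] * x[1] - y) ** 2
                for x, y in zip(xs, ys))
    return total / len(xs)


def svrg_adadelta_momentum(xs, ys, epochs=30, rho=0.95, eps=1e-6,
                           beta=0.9, seed=1):
    """SVRG gradient + Adadelta step size + heavy-ball momentum (assumed mix)."""
    rng = random.Random(seed)
    n, d = len(xs), 2
    w = [0.0] * d
    eg2 = [0.0] * d   # running average of squared gradients (Adadelta)
    edx2 = [0.0] * d  # running average of squared steps (Adadelta)
    vel = [0.0] * d   # momentum buffer
    for _ in range(epochs):
        snap = list(w)  # SVRG snapshot of the parameters
        mu = [0.0] * d  # full gradient evaluated at the snapshot
        for x, y in zip(xs, ys):
            g = grad_i(snap, x, y)
            for j in range(d):
                mu[j] += g[j] / n
        for _ in range(n):
            i = rng.randrange(n)
            g_w = grad_i(w, xs[i], ys[i])
            g_s = grad_i(snap, xs[i], ys[i])
            for j in range(d):
                g = g_w[j] - g_s[j] + mu[j]  # variance-reduced gradient
                eg2[j] = rho * eg2[j] + (1 - rho) * g * g
                # Adadelta step: no hand-tuned learning rate appears here.
                step = -math.sqrt(edx2[j] + eps) / math.sqrt(eg2[j] + eps) * g
                edx2[j] = rho * edx2[j] + (1 - rho) * step * step
                vel[j] = beta * vel[j] + step
                w[j] += vel[j]
    return w
```

Calling `svrg_adadelta_momentum(*make_data())` returns the fitted weight list; comparing `full_loss` before and after gives a quick check that the sketch makes progress on the toy problem. The point of the design is that the step size is derived from running averages of past gradients and past steps, so no initial learning rate has to be screened by hand.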