Abstract
Currently, the widely used first-order deep learning optimizers fall into two classes: optimizers with a non-adaptive learning rate, represented by SGDM, and optimizers with an adaptive learning rate, represented by Adam. Both classes estimate the overall gradient with an exponential moving average (EMA). However, the EMA estimate of the gradient is biased and lags behind the true gradient. This paper proposes RSGDM, a rectified SGDM algorithm based on a difference correction. Our contributions are threefold: 1) We analyze the bias and lag introduced by the exponential moving average in the SGDM algorithm. 2) We correct this bias and lag with a difference estimation term, yielding the RSGDM algorithm. 3) Experiments on the CIFAR-10 and CIFAR-100 datasets show that RSGDM outperforms SGDM in convergence accuracy.
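The idea in the abstract can be illustrated with a minimal sketch. The `sgdm_step` below is the standard SGDM update, whose buffer is an exponential moving average (EMA) of past gradients and therefore lags behind the current gradient. The correction in `rsgdm_step` is only an assumption for illustration: it compensates the lag with the first-order gradient difference g_t − g_{t−1}; the paper's exact correction term is not reproduced here.

```python
def sgdm_step(param, grad, buf, lr=0.05, beta=0.9):
    """Standard SGDM step. The buffer is an EMA of past gradients:
    m_t = beta * m_{t-1} + (1 - beta) * g_t, which is biased toward
    older gradients and lags behind the current one."""
    buf = beta * buf + (1 - beta) * grad
    return param - lr * buf, buf

def rsgdm_step(param, grad, prev_grad, buf, lr=0.05, beta=0.9):
    """Hypothetical difference-corrected step (illustration only):
    a scaled first-order difference grad - prev_grad is added to the
    EMA to partially offset its lag. The exact correction used by
    RSGDM in the paper may differ from this sketch."""
    buf = beta * buf + (1 - beta) * grad
    corrected = buf + beta * (grad - prev_grad)  # assumed correction term
    return param - lr * corrected, buf

# Usage on the toy objective f(x) = x^2 (gradient 2x):
x, buf = 5.0, 0.0
prev_g = 2 * x
for _ in range(50):
    g = 2 * x
    x, buf = rsgdm_step(x, g, prev_g, buf)
    prev_g = g
```

Both updates contract toward the minimizer on this toy problem; the sketch is only meant to show where a difference term would enter the SGDM update, not to reproduce the paper's algorithm.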
Authors
YUAN Wei; HU Fei (School of Mathematics, Tianjin University, Tianjin 300350, China)
Source
《计算机系统应用》
2021, No. 7, pp. 220-224 (5 pages)
Computer Systems & Applications
Keywords
deep learning
first-order optimizer
SGDM algorithm
difference