
Stochastic Algorithm with Reduced Variance and Weighted Average for Solving Non-smooth Strongly Convex Optimization Problems
Abstract: Using a variance reduction strategy in stochastic methods for smooth problems can effectively improve the convergence of the algorithm. Combining the ideas of weighted averaging and variance reduction, a stochastic algorithm, hybrid regularized mirror descent with reduced variance and weighted average (α-HRMDVR-W), is obtained for solving the non-smooth strongly convex "L1 + L2 + Hinge" optimization problem. A variance reduction strategy is applied at each iteration, and the output is a weighted average of the iterates. The algorithm is proved to achieve the optimal convergence rate, and this rate does not depend on the number of samples. Unlike existing variance reduction methods, α-HRMDVR-W uses only a small portion of the samples, instead of the full sample set, to correct the gradient at each iteration. Experimental results show that α-HRMDVR-W reduces the variance while also saving CPU time.
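The following is a minimal, hypothetical sketch of the kind of update the abstract describes: an SVRG-style mini-batch correction of the hinge subgradient combined with a weighted average of the iterates for the "L1 + L2 + Hinge" objective. It is not the paper's exact α-HRMDVR-W procedure; the 1/(λ2·t) step size, the linearly growing averaging weights, the soft-thresholding treatment of the L1 term, and all names (hrmdvr_w_sketch, lam1, lam2, batch) are illustrative assumptions.

```python
# Sketch only (not the authors' exact alpha-HRMDVR-W update): variance-reduced
# stochastic subgradient steps for
#   F(w) = (1/n) * sum_i max(0, 1 - y_i <x_i, w>) + lam1*||w||_1 + (lam2/2)*||w||^2,
# with a mini-batch (a small portion of samples) as the reference gradient and a
# weighted average of the iterates as the output.
import numpy as np

def hinge_subgrad(w, x, y):
    # Subgradient of max(0, 1 - y * <x, w>) with respect to w.
    return -y * x if y * np.dot(x, w) < 1.0 else np.zeros_like(w)

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1 (handles the non-smooth L1 term).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def hrmdvr_w_sketch(X, y, lam1=1e-4, lam2=1e-2, epochs=10, batch=32, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    w_sum, weight_sum, t = np.zeros(d), 0.0, 0

    for _ in range(epochs):
        # Reference point and mini-batch reference gradient: only a small
        # portion of samples is used to build the correction term.
        w_ref = w.copy()
        ref_idx = rng.choice(n, size=min(batch, n), replace=False)
        mu = np.mean([hinge_subgrad(w_ref, X[i], y[i]) for i in ref_idx], axis=0)

        for _ in range(n):
            t += 1
            i = rng.integers(n)
            # Variance-reduced stochastic (sub)gradient of the hinge part.
            g = hinge_subgrad(w, X[i], y[i]) - hinge_subgrad(w_ref, X[i], y[i]) + mu
            eta = 1.0 / (lam2 * t)      # step size for strong convexity (assumption)
            w = soft_threshold(w - eta * (g + lam2 * w), eta * lam1)

            alpha = t                    # linearly increasing weight (assumption)
            w_sum += alpha * w
            weight_sum += alpha

    return w_sum / weight_sum            # weighted-average output

# Example usage with synthetic data:
# X = np.random.randn(200, 10); y = np.sign(np.random.randn(200))
# w_bar = hrmdvr_w_sketch(X, y)
```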
Authors: Zhu Xiaohui (朱小辉), Tao Qing (陶卿)
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2016, No. 7, pp. 577-589 (13 pages); indexed in EI, CSCD, Peking University Core (北大核心)
Fund: Supported by the National Natural Science Foundation of China (No. 61273296)
Keywords: Machine Learning, Stochastic Optimization, Reduced Variance