
Stochastic Algorithm with Reduced Variance and Weighted Average for Solving Non-smooth Strongly Convex Optimization Problems
Abstract: Using a variance reduction strategy in stochastic methods for smooth problems can effectively improve the convergence of the algorithm. Combining the ideas of weighted averaging and variance reduction, a stochastic algorithm, hybrid regularized mirror descent with reduced variance and weighted average (α-HRMDVR-W), is obtained for solving the non-smooth strongly convex "L1 + L2 + Hinge" optimization problem. A variance reduction strategy is applied at each iteration, and the output is a weighted average of the iterates. The algorithm is proved to achieve the optimal convergence rate, and this rate does not depend on the number of samples. Unlike existing variance reduction methods, α-HRMDVR-W uses only a small portion of the samples, instead of the full sample set, to correct the gradient at each iteration. Experimental results show that α-HRMDVR-W reduces the variance while also saving CPU time.
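The following is a minimal, hypothetical sketch of the kind of update the abstract describes: an SVRG-style mini-batch correction of the hinge subgradient combined with a weighted average of the iterates for the "L1 + L2 + Hinge" objective. It is not the paper's exact α-HRMDVR-W procedure; the 1/(λ2·t) step size, the linearly growing averaging weights, the soft-thresholding treatment of the L1 term, and all names (hrmdvr_w_sketch, lam1, lam2, batch) are illustrative assumptions.

```python
# Sketch only (not the authors' exact alpha-HRMDVR-W update): variance-reduced
# stochastic subgradient steps for
#   F(w) = (1/n) * sum_i max(0, 1 - y_i <x_i, w>) + lam1*||w||_1 + (lam2/2)*||w||^2,
# with a mini-batch (a small portion of samples) as the reference gradient and a
# weighted average of the iterates as the output.
import numpy as np

def hinge_subgrad(w, x, y):
    # Subgradient of max(0, 1 - y * <x, w>) with respect to w.
    return -y * x if y * np.dot(x, w) < 1.0 else np.zeros_like(w)

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1 (handles the non-smooth L1 term).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def hrmdvr_w_sketch(X, y, lam1=1e-4, lam2=1e-2, epochs=10, batch=32, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    w_sum, weight_sum, t = np.zeros(d), 0.0, 0

    for _ in range(epochs):
        # Reference point and mini-batch reference gradient: only a small
        # portion of samples is used to build the correction term.
        w_ref = w.copy()
        ref_idx = rng.choice(n, size=min(batch, n), replace=False)
        mu = np.mean([hinge_subgrad(w_ref, X[i], y[i]) for i in ref_idx], axis=0)

        for _ in range(n):
            t += 1
            i = rng.integers(n)
            # Variance-reduced stochastic (sub)gradient of the hinge part.
            g = hinge_subgrad(w, X[i], y[i]) - hinge_subgrad(w_ref, X[i], y[i]) + mu
            eta = 1.0 / (lam2 * t)      # step size for strong convexity (assumption)
            w = soft_threshold(w - eta * (g + lam2 * w), eta * lam1)

            alpha = t                    # linearly increasing weight (assumption)
            w_sum += alpha * w
            weight_sum += alpha

    return w_sum / weight_sum            # weighted-average output

# Example usage with synthetic data:
# X = np.random.randn(200, 10); y = np.sign(np.random.randn(200))
# w_bar = hrmdvr_w_sketch(X, y)
```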
Authors: Zhu Xiaohui (朱小辉), Tao Qing (陶卿)
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2016, No. 7, pp. 577-589 (13 pages); indexed in EI, CSCD, Peking University Core (北大核心)
Fund: Supported by the National Natural Science Foundation of China (No. 61273296)
Keywords: Machine Learning, Stochastic Optimization, Reduced Variance