
Sparse Stochastic Optimization Algorithm with Optimal Individual Convergence Rate Based on Random Step-Size
Abstract: Almost all sparse stochastic algorithms are derived from the online setting, so only a convergence rate for the averaged output can be obtained, and the optimal individual convergence rate for strongly convex optimization problems cannot be attained. This paper avoids the online-to-batch conversion and studies stochastic optimization algorithms directly. Firstly, an L2 regularizer is added to the L1-regularized sparse optimization problem so that the objective becomes strongly convex. Then, the random step-size strategy from black-box optimization methods is introduced into the widely used composite optimization algorithm COMID (composite objective mirror descent), yielding a sparse stochastic optimization algorithm based on random step-size hybrid regularized mirror descent (RS-HMD). Finally, by analyzing how the soft-thresholding method solves the L1-regularized problem, the optimal individual convergence rate of the algorithm is proved. Experimental results show that the sparsity of RS-HMD is better than that of COMID.
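To make the construction described in the abstract concrete, the sketch below shows a COMID-style composite update with a hybrid L1 + L2 regularizer, where the L1 part is handled by soft thresholding and the step size is drawn at random around a 1/(lambda2*t) schedule. This is a minimal illustration only, not the paper's RS-HMD implementation: the squared loss, the uniform step-size perturbation, the function names (rs_hmd_sketch, soft_threshold), and all parameter values are assumptions made for this example.

```python
# Illustrative sketch (not the authors' code): a composite-mirror-descent-style
# update with a hybrid L1 + L2 regularizer and a randomized step size, applied
# to a least-squares loss. Step-size schedule and parameters are assumptions.
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (element-wise soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def rs_hmd_sketch(X, y, lam1=0.1, lam2=0.1, T=1000, seed=0):
    """Stochastic composite updates on 0.5*(x^T w - y)^2 + lam1*||w||_1,
    with an extra (lam2/2)*||w||_2^2 term to make the objective strongly
    convex and a step size drawn uniformly around 1/(lam2*t) (assumption)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, T + 1):
        i = rng.integers(n)                       # sample one training example
        grad = (X[i] @ w - y[i]) * X[i]           # stochastic gradient of the smooth loss
        eta = rng.uniform(0.5, 1.5) / (lam2 * t)  # randomized step size (illustrative)
        # composite update: gradient step on the smooth part,
        # exact proximal step for the L1 + L2 regularizer
        v = w - eta * grad
        w = soft_threshold(v, eta * lam1) / (1.0 + eta * lam2)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    w_true = np.zeros(50)
    w_true[:5] = rng.normal(size=5)               # sparse ground-truth weights
    X = rng.normal(size=(500, 50))
    y = X @ w_true + 0.01 * rng.normal(size=500)
    w_hat = rs_hmd_sketch(X, y)
    print("nonzeros:", np.count_nonzero(np.round(w_hat, 3)))
```

The closed-form proximal step soft_threshold(v, eta*lam1) / (1 + eta*lam2) is the standard prox of the combined L1 + L2 term; the soft-thresholding component forces exact zeros in every iterate, which is the sparsity property the abstract refers to.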
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2015, No. 10: 876-885 (10 pages). Indexed in EI, CSCD, and the Peking University Core Journal list.
Funding: Supported by the National Natural Science Foundation of China (No. 61273296) and the Youth Program of the Anhui Provincial Natural Science Foundation (No. 1508085QF114, No. 1308085QF121).
Keywords: Machine Learning, Stochastic Optimization, Optimal Individual Convergence Rate, Sparsity

References (18)

  • 1 Wang J, Tao Q. Machine Learning: The State of the Art. IEEE Intelligent Systems, 2008, 23(6): 49-55.
  • 2 Tao Q, Gao Q K, Jiang J Y, Chu D J. A Survey of Solving Optimization Problems in Sparse Learning. Journal of Software, 2013, 24(11): 2498-2507. (in Chinese)
  • 3 Shalev-Shwartz S, Singer Y, Srebro N, et al. Pegasos: Primal Estimated Sub-gradient Solver for SVM. Mathematical Programming, 2011, 127(1): 3-30.
  • 4 Duchi J, Shalev-Shwartz S, Singer Y, et al. Composite Objective Mirror Descent // Proc of the 23rd Annual Workshop on Computational Learning Theory. New York, USA, 2010: 116-128.
  • 5 Xiao L. Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization. Journal of Machine Learning Research, 2010, 11: 2543-2596.
  • 6 Zinkevich M. Online Convex Programming and Generalized Infinitesimal Gradient Ascent // Proc of the 20th International Conference on Machine Learning. New York, USA, 2003: 928-936.
  • 7 Rakhlin A, Shamir O, Sridharan K. Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization // Proc of the 29th International Conference on Machine Learning. Edinburgh, UK, 2012: 449-456.
  • 8 Shamir O, Zhang T. Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes // Proc of the 30th International Conference on Machine Learning. New York, USA, 2013: 71-79.
  • 9 Lacoste-Julien S, Schmidt M, Bach F. A Simpler Approach to Obtaining an O(1/t) Convergence Rate for Projected Stochastic Sub-gradient Descent [EB/OL]. [2014-12-20]. http://arxiv.org/pdf/1212.2002v2.pdf.
  • 10 Hazan E, Kale S. Beyond the Regret Minimization Barrier: Optimal Algorithms for Stochastic Strongly-Convex Optimization. Journal of Machine Learning Research, 2014, 15(1): 2489-2512.
