
面向大规模数据主题建模的方差减小的随机变分推理算法 (Cited by: 1)

Variance reduced stochastic variational inference algorithm for topic modeling of large-scale data
Abstract: Stochastic Variational Inference (SVI) has been successfully applied to many types of models, including topic models. Although SVI scales to large data sets by mapping the inference problem to an optimization problem involving stochastic gradients, the noise inherent in those stochastic gradients produces a large variance that hinders fast convergence. To address this problem, an improved algorithm called Variance Reduced SVI (VR-SVI) was proposed. Firstly, a sliding-window method was used to recompute the noise term in the stochastic gradient and construct a new stochastic gradient, reducing the influence of noise on the gradient. Then, it was proved that the proposed algorithm reduces the variance of the stochastic gradient relative to SVI. Finally, the influence of the window size on the algorithm was discussed and the convergence of the algorithm was analyzed. Experimental results show that the VR-SVI algorithm not only reduces the variance of the stochastic gradient but also saves computation time, achieving fast convergence.
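The abstract only outlines the idea of smoothing the noisy term of the stochastic gradient over a sliding window; the paper's exact update rule is not reproduced here. The sketch below is therefore a minimal illustration, assuming the standard SVI update for a global variational parameter, lambda <- (1 - rho_t) * lambda + rho_t * lambda_hat_t, in which the noisy per-document target lambda_hat_t is replaced by the mean of the last W such targets before the interpolation step. The function names (`svi_sliding_window`, `noisy_estimate`) and the toy target in the usage example are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from collections import deque

def svi_sliding_window(noisy_estimate, init_lambda, num_iters,
                       window_size=10, kappa=0.7, tau=1.0):
    """Generic SVI loop with a sliding-window average of the noisy target.

    Illustrative variance-reduction step, not the exact VR-SVI update from
    the paper. `noisy_estimate(t)` returns an unbiased but noisy estimate of
    the optimal global variational parameter computed from one sampled
    document (lambda_hat_t in standard SVI).
    """
    lam = np.asarray(init_lambda, dtype=float).copy()
    window = deque(maxlen=window_size)            # last W noisy targets
    for t in range(num_iters):
        lam_hat = noisy_estimate(t)               # noisy per-document target
        window.append(lam_hat)
        smoothed = np.mean(list(window), axis=0)  # sliding-window average: lower variance
        rho = (t + tau) ** (-kappa)               # Robbins-Monro step size
        lam = (1.0 - rho) * lam + rho * smoothed  # standard SVI interpolation
    return lam

# Toy usage (hypothetical): the true target is a fixed vector and each
# per-iteration estimate adds Gaussian noise around it.
rng = np.random.default_rng(0)
target = np.array([2.0, 5.0, 1.0])
estimate = lambda t: target + rng.normal(scale=2.0, size=target.shape)
print(svi_sliding_window(estimate, init_lambda=np.ones(3), num_iters=500))
```

In such a scheme the window size W trades variance against staleness: a larger window averages away more noise but mixes in older estimates, which is broadly the trade-off the paper addresses when discussing the influence of window size and analyzing convergence.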
Authors: LIU Zhanghu (刘张虎), CHENG Chunling (程春玲) (School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003, China)
Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2018, No. 6, pp. 1675-1681 (7 pages)
Keywords: Stochastic Variational Inference (SVI); sliding window; stochastic gradient; variance reduction; topic modeling
