
Tuning the Learning Rate for Stochastic Variational Inference

Abstract: Stochastic variational inference (SVI) can learn topic models from very large corpora. It optimizes the variational objective with a stochastic natural gradient algorithm whose learning rate decreases over iterations. This rate is crucial for SVI; however, it is often tuned by hand in real applications. To address this, we develop a novel algorithm that adaptively tunes the learning rate at each iteration. The proposed algorithm uses the Kullback-Leibler (KL) divergence to measure the similarity between the variational distribution under the noisy update and that under the batch update, and then chooses the learning rate by minimizing this KL divergence. We apply our algorithm to two representative topic models: latent Dirichlet allocation and the hierarchical Dirichlet process. Experimental results indicate that our algorithm performs better and converges faster than commonly used learning rate schedules.
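The standard SVI step blends the current variational parameter with a noisy minibatch estimate, lambda_{t+1} = (1 - rho_t) * lambda_t + rho_t * lambda_hat_t, and the abstract's idea is to pick rho_t so that this noisy step lands as close as possible (in KL divergence) to the batch update. The sketch below illustrates that rate-selection idea for a single Dirichlet variational factor (as in LDA's topic-word distributions); it is not the paper's exact algorithm. In particular, the true batch update is unavailable in the stochastic setting, so the sketch substitutes a running average of noisy updates (lam_bar) as a proxy, and kl_dirichlet, adaptive_rate, and all numeric settings are illustrative assumptions.

```python
# Minimal sketch of KL-based learning-rate selection for SVI.
# Assumptions (not from the paper): Dirichlet variational factors,
# and a running average of noisy updates as a stand-in for the batch update.
import numpy as np
from scipy.special import gammaln, digamma
from scipy.optimize import minimize_scalar

def kl_dirichlet(alpha, beta):
    """Closed-form KL( Dir(alpha) || Dir(beta) ) for concentration vectors."""
    a0, b0 = alpha.sum(), beta.sum()
    return (gammaln(a0) - gammaln(alpha).sum()
            - gammaln(b0) + gammaln(beta).sum()
            + ((alpha - beta) * (digamma(alpha) - digamma(a0))).sum())

def adaptive_rate(lam, lam_hat, lam_bar):
    """Choose rho in (0, 1) so the one-step noisy update
    (1 - rho) * lam + rho * lam_hat is closest in KL to the
    proxy batch update lam_bar."""
    def objective(rho):
        cand = (1.0 - rho) * lam + rho * lam_hat
        return kl_dirichlet(cand, lam_bar)
    res = minimize_scalar(objective, bounds=(1e-6, 1.0), method="bounded")
    return res.x

# Toy loop over synthetic minibatch estimates.
rng = np.random.default_rng(0)
V = 50                            # vocabulary size (illustrative)
lam = np.ones(V)                  # current variational parameter
lam_bar = np.ones(V)              # running proxy for the batch update (assumption)
for t in range(100):
    # In real SVI, lam_hat comes from a natural-gradient step on a minibatch;
    # here it is synthetic noise around a fixed target.
    lam_hat = 5.0 + rng.gamma(shape=2.0, scale=1.0, size=V)
    lam_bar = 0.9 * lam_bar + 0.1 * lam_hat
    rho = adaptive_rate(lam, lam_hat, lam_bar)
    lam = (1.0 - rho) * lam + rho * lam_hat
```

Because the Dirichlet KL is convex in the interpolated concentration over rho in (0, 1) for this proxy, a bounded one-dimensional search is enough; no rate schedule needs to be hand-tuned.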
Source: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2016, No. 2, pp. 428-436 (9 pages).
Funding: This work was supported by the National Natural Science Foundation of China under Grant Nos. 61170092, 61133011, and 61103091.
Keywords: stochastic variational inference, online learning, adaptive learning rate, topic model
