
Topic evolution analysis based on improved online LDA model (cited by: 15)
Abstract: To solve the problems of topic mixing and delayed detection of new topics in the traditional OLDA model, an improved online LDA (IOLDA) model is proposed based on OLDA. The model sets a different heritability for each topic according to its topic intensity. A new method for measuring topic intensity is also introduced: each document is assigned a weight according to how concentrated its document-topic distribution is, which effectively lowers the intensity score of broad topics. Because the model keeps topics aligned across epochs, associations between topics can be computed directly with the Jensen-Shannon distance. Experimental results show that the proposed method can effectively analyze topic evolution online.
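The two measures named in the abstract can be illustrated with a minimal Python sketch. The Jensen-Shannon divergence below follows its standard definition. The abstract does not give the paper's weighting formula, so `weighted_topic_intensity` is only an assumption: it uses one minus the normalized entropy of each document's topic distribution as a hypothetical concentration weight. Both function names are illustrative and do not come from the paper.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 wherever a > 0.
        mask = a > 0
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def weighted_topic_intensity(theta):
    """Topic intensity from a (n_docs, n_topics) document-topic matrix.

    ASSUMPTION (not the paper's formula): each document is weighted by
    1 - H(theta_d) / log(K), so documents spread evenly over many topics
    count less than documents concentrated on a few topics.
    """
    theta = np.asarray(theta, dtype=float)
    n_topics = theta.shape[1]
    safe = np.where(theta > 0, theta, 1.0)  # log(1) = 0 for zero entries
    entropy = -np.sum(theta * np.log(safe), axis=1)
    weights = 1.0 - entropy / np.log(n_topics)
    total = weights.sum()
    if total <= 0:
        # Every document is uniform: fall back to the unweighted mean.
        return theta.mean(axis=0)
    return weights @ theta / total


# Usage: compare the same (aligned) topic in two consecutive time slices.
topic_prev = [0.7, 0.2, 0.1]
topic_curr = [0.6, 0.3, 0.1]
drift = js_divergence(topic_prev, topic_curr)

doc_topic = [[0.9, 0.05, 0.05], [0.34, 0.33, 0.33]]
intensity = weighted_topic_intensity(doc_topic)
```

With base-2 logarithms the divergence lies in [0, 1], and its square root is the quantity usually called the Jensen-Shannon distance. Which of the two the paper uses is not stated in the abstract.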
Source: Journal of Central South University (Science and Technology), 2015, No. 2, pp. 547-553 (7 pages). Indexed in: EI, CAS, CSCD, Peking University Core.
Funding: National Science and Technology Support Program of China (2012BAH18B05); National Natural Science Foundation of China (61272447).
Keywords: topic evolution; topic genetic; topic intensity; LDA model
