
LDA Topic Evolution Based on Global and Local Modeling (基于局部和全局的LDA话题演化分析)

Cited by: 3
Abstract: Topic evolution refers to changes in the content and strength of a topic over time. This paper gives a formal description of topic evolution and examines two ways of modeling it, one based on the global document collection and one based on local (per-period) documents. Both are evaluated with two metrics, topic similarity and perplexity. Case studies of two topics, real estate and the 2008 Olympic Games, illustrate the strengths and weaknesses of each modeling approach. Experiments on the most recent five years of NPC & CPPCC ("Two Sessions") news reports show that global topic evolution yields good model parameters and is simple and reliable, whereas local topic evolution produces fine-grained topics and reflects the emergence of new topics and the disappearance of old ones.
Authors: Zhang Jian (章建), Li Fang (李芳)
Source: Journal of Shanghai Jiaotong University (《上海交通大学学报》), 2012, No. 11, pp. 1753-1758 (6 pages). Indexed in EI, CAS, CSCD, PKU Core (北大核心).
Funding: National Natural Science Foundation of China (60873134)
Keywords: text information processing; latent Dirichlet allocation; topic detection and evolution
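
The abstract names two evaluation metrics, perplexity and topic similarity, but this record does not give their formulas. The sketch below is only an illustration under stated assumptions: perplexity follows the standard held-out definition exp(-Σ log p(w) / N) with p(w|d) approximated as Σ_k θ_dk φ_kw, and topic similarity is taken to be cosine similarity between topic-word distributions, which may differ from the measure the paper actually uses. The function `link_topics` and its `threshold` parameter are hypothetical names introduced here to show how topics from adjacent time slices might be matched in the local setting, with an unmatched topic read as a newly emerged one.

```python
import math


def perplexity(doc_topic, topic_word, docs):
    """Perplexity of a corpus: exp(-sum of log p(w|d) over tokens / token count).

    doc_topic[d][k]  : weight of topic k in document d (theta)
    topic_word[k][w] : probability of word id w under topic k (phi)
    docs[d]          : list of word ids in document d
    """
    log_lik = 0.0
    n_tokens = 0
    for theta, doc in zip(doc_topic, docs):
        for w in doc:
            # Mixture probability of the word under the document's topics.
            p = sum(theta[k] * topic_word[k][w] for k in range(len(theta)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


def cosine_similarity(p, q):
    """Cosine similarity between two topic-word vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm


def link_topics(prev_topics, curr_topics, threshold=0.5):
    """Hypothetical matching step for local (per-slice) topic evolution.

    For each topic in the current time slice, return the index of the most
    similar topic in the previous slice, or None when the best similarity is
    below `threshold` (interpreted here as a new topic).
    """
    links = []
    for q in curr_topics:
        sims = [cosine_similarity(p, q) for p in prev_topics]
        best = max(range(len(sims)), key=sims.__getitem__)
        links.append(best if sims[best] >= threshold else None)
    return links


if __name__ == "__main__":
    # Toy topic-word distributions over a 4-word vocabulary (made-up values).
    prev = [[0.7, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 0.7]]
    curr = [[0.6, 0.3, 0.1, 0.0], [0.0, 0.0, 1.0, 0.0]]
    print(link_topics(prev, curr))
    print(perplexity([[1.0]], [[0.25, 0.25, 0.25, 0.25]], [[0, 1, 2, 3]]))
```

In this reading, the global approach would fit one model on all documents and track each topic's strength per period, while the local approach would fit one model per period and rely on a matching step such as `link_topics` to connect topics across periods. The threshold value is arbitrary and would need tuning on real data.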