Topic Change Detection Method Based on Joint Nonnegative Matrix Factorization
(Original title: 基于联合非负矩阵分解的话题变迁检测方法; cited by: 1)
Abstract: In large-scale temporal document collections, existing approaches lack the ability to identify common and distinct topics across time slices and to track and analyze how topics change over time. To address this, a topic change detection method for temporal document corpora is proposed. The method discovers similar topics and common-and-distinct topics in a temporal document corpus: an improved joint Nonnegative Matrix Factorization (NMF) algorithm extracts topic sets from multiple datasets; to avoid introducing noise topics, the topic entropy of every extracted topic is computed and only high-quality topics are retained; word clouds and trend graphs are then used to analyze the trend of topic change. Experimental results on the 20Newsgroups and LTN2011 datasets show that the method effectively discovers common and distinct topics from temporal document collections, and that the extracted topics are of good quality and high accuracy.
Source: Computer Engineering (《计算机工程》), 2018, No. 1, pp. 35-43 (9 pages). Indexed in CAS, CSCD, and the Peking University Core Journal list.
Funding: Research Program of the Science and Technology Commission of Shanghai Municipality (16511102702); Project of the Shanghai Municipal Commission of Economy and Informatization (150643).
Keywords: Joint Nonnegative Matrix Factorization (NMF); topic model; temporal common and distinct topics; high-quality topics; topic change detection
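
The abstract above describes two technical steps: extracting topic sets from several time slices with an improved joint NMF, and filtering noisy topics by topic entropy. The paper's exact joint factorization is not reproduced here, so the following is a minimal sketch under assumptions: each time slice is factorized separately with scikit-learn's standard NMF over a shared vocabulary, topics are matched across slices by cosine similarity as a rough proxy for "similar" versus "distinct" topics, and topic entropy is computed over each topic's term distribution. The function names, number of topics, and similarity threshold are illustrative choices, not values from the paper.

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import NMF
    from sklearn.metrics.pairwise import cosine_similarity

    def extract_topics(vectorizer, docs, n_topics=10):
        # Factorize one time slice: X ~ W * H; each row of H is a topic's term-weight vector.
        X = vectorizer.transform(docs)
        model = NMF(n_components=n_topics, init="nndsvd", random_state=0, max_iter=400)
        model.fit(X)
        return model.components_  # shape: (n_topics, n_terms)

    def topic_entropy(H):
        # Entropy of each topic's term distribution; diffuse (noisy) topics score high.
        P = H / (H.sum(axis=1, keepdims=True) + 1e-12)
        return -(P * np.log(P + 1e-12)).sum(axis=1)

    def compare_slices(docs_t1, docs_t2, n_topics=10, sim_threshold=0.5):
        # Fit one vocabulary over both slices so topic-term vectors are comparable.
        vectorizer = TfidfVectorizer(max_features=5000, stop_words="english")
        vectorizer.fit(list(docs_t1) + list(docs_t2))
        H1 = extract_topics(vectorizer, docs_t1, n_topics)
        H2 = extract_topics(vectorizer, docs_t2, n_topics)
        sim = cosine_similarity(H1, H2)  # pairwise similarity between topics of the two slices
        similar_pairs = [(i, j) for i in range(n_topics) for j in range(n_topics)
                         if sim[i, j] >= sim_threshold]  # unmatched topics are treated as distinct
        return similar_pairs, topic_entropy(H1), topic_entropy(H2)

In this sketch, topic pairs whose similarity exceeds the threshold play the role of similar topics, unmatched topics stand in for distinct topics, and low-entropy topics would be kept as high-quality topics before being visualized with word clouds and per-slice trend graphs, loosely mirroring the workflow described in the abstract.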
