
Microblog Hot Topics Detection Based on Heat Matrix (基于热度矩阵的微博热点话题发现)

Cited by: 9
Abstract: Existing models for detecting hot topics in microblogs are sensitive to the volume of microblog data and are slow to detect topics. To address this, the paper proposes a topic model based on a heat matrix. The heat matrix is used to obtain the heat of each latent topic and its topic-word probability distribution, and the heat shared between words is used to mine their semantic relationships, so that hot topics and hot words in the data can be identified accurately. Experimental results on real microblog data show that, compared with the Latent Dirichlet Allocation (LDA) model, the proposed model achieves higher efficiency and accuracy. The hot topics it detects are consistent with real-time events, indicating good hot-spot identification performance.
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心), 2017, No. 2, pp. 57-62 (6 pages)
Funding: National Natural Science Foundation of China, Key Project (U1135005)
Keywords: heat matrix; topic model; microblog; topic detection; text mining
