
Research on Domanial Microblog Topic Evolution Based on HDP Model (基于HDP模型的领域微博主题演化研究)

Cited by: 2
Abstract: Domain-specific microblogs carry a large amount of professional information, and their content evolves markedly over time. To analyze how the topics of a domain evolve, this paper builds DM-HDP, a model based on the Hierarchical Dirichlet Process (HDP). Domain-related microblogs are first extracted with the individual user as the unit. Using the domain features and temporal features of the microblogs, the method then extracts the domain-related microblogs that have distinct temporal features and automatically mines their topic distributions. Finally, a process for analyzing domain topic evolution is constructed. Experimental results show that the DM-HDP-based method can capture the evolution of topics in domain microblogs and that, compared with methods based on the LDA and HDP models, it has clear advantages in content perplexity and model complexity.
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the PKU Core Journals list (北大核心); 2018, No. 2, pp. 1-8 (8 pages)
Funding: National Natural Science Foundation of China (61163025); Natural Science Foundation of Inner Mongolia Autonomous Region (2015MS0621)
Keywords: domain microblog; topic mining; hierarchical Dirichlet model; DM-HDP model; Gibbs sampling; topic evolution
