
An SMLDA Model for Micro-blog Topic Extraction
Abstract: Micro-blog posts are short, loosely formatted, incompletely described, and noisy, which makes it difficult to extract their topics efficiently. To address this problem, an improved model based on LDA, called SMLDA, is proposed. The model jointly considers the relationships among micro-blog authors, the tags that mark specific micro-blog topics, the repost (relay) relationships between micro-blog texts, and a background topic, and it uses the Gibbs sampling algorithm to derive the model parameters. Experiments on a real Sina micro-blog data set show that, compared with the LDA model, the SMLDA model is more efficient and extracts topics more accurately.
Source: Journal of Guilin University of Electronic Technology (《桂林电子科技大学学报》), 2015, No. 3, pp. 241-244 (4 pages)
Funding: National 863 Program of China (2012AA011005)
Keywords: Sina micro-blog; Gibbs algorithm; topic extraction
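
The abstract names Gibbs sampling as the way the model parameters are derived, with plain LDA as the baseline. The sketch below illustrates collapsed Gibbs sampling for the plain LDA baseline only. It is not the SMLDA sampler, whose author, tag, repost, and background-topic components are not specified on this page. The function name `lda_gibbs`, the hyperparameter defaults, and the toy corpus are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for plain LDA (baseline only, not SMLDA).

    docs is a list of documents, each a list of integer word ids in
    [0, vocab_size). Returns (theta, phi): per-document topic distributions
    and per-topic word distributions.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # n_{d,k}
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # n_{k,w}
    topic_total = [0] * num_topics                               # n_k
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional p(z = t | all other assignments).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Point estimates from the final counts.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
         for w in range(vocab_size)]
        for k in range(num_topics)
    ]
    return theta, phi


# Toy corpus (made up): three short "posts" over a four-word vocabulary.
toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0]]
theta, phi = lda_gibbs(toy_docs, num_topics=2, vocab_size=4, iters=50)
```

Each row of `theta` and `phi` is a probability distribution. A model along the lines described in the abstract would presumably extend the sampling step with additional variables, but those details would have to come from the paper itself.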


