
CGRMB-LDA: Topic mining oriented to implicit microblogs
Cited by: 3
Abstract: Microblog posts are short, contain few words, and follow a casual grammatical style, so many of them lack topic words and cannot be assigned to a topic; such posts are called implicit microblogs. This paper proposes CGRMB-LDA (Comment Group-Retransmission Microblog Latent Dirichlet Allocation), an improved LDA-based generative model that takes comment groups and retransmitted microblogs into account. The model uses the comment, retransmission, and context relationships between microblogs to expand implicit microblogs and make their topic membership explicit, and it is solved by Gibbs sampling to obtain the topic set and the topic of each microblog. Experiments on a real dataset show that CGRMB-LDA can effectively mine the topics of microblogs, especially implicit microblogs.
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2016, Issue A01, pp. 67-71 (5 pages)
Keywords: microblog; topic mining; comment group; retransmission microblog; Latent Dirichlet Allocation (LDA); implicit microblog
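
The abstract names two ingredients: expanding an implicit microblog with text from related posts, and fitting a topic model by Gibbs sampling. The sketch below illustrates both under loud assumptions. It is not the CGRMB-LDA model, whose generative process is not given in this record; it pools tokens from related posts and then runs a collapsed Gibbs sampler for plain LDA. The function names `expand_microblog` and `gibbs_lda`, the hyperparameter values, and the toy tokens are all hypothetical.

```python
import random


def expand_microblog(tokens, related_token_lists):
    """Pool a short post with tokens from related posts (assumed here to be
    its comments, the post it retransmits, or neighbouring context posts)."""
    pooled = list(tokens)
    for related in related_token_lists:
        pooled.extend(related)
    return pooled


def gibbs_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Collapsed Gibbs sampling for standard LDA (a stand-in, not CGRMB-LDA).

    docs is a list of token lists. Returns (theta, topic_word, vocab), where
    theta[d][k] is the estimated share of topic k in document d and
    topic_word[k][v] counts how often vocab[v] is assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Unnormalised full conditional p(z = k | all other assignments).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    theta = [
        [(count + alpha) / (len(doc) + num_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word, vocab


if __name__ == "__main__":
    # Toy data: the first post has no topic words until its comment is pooled in.
    implicit_post = expand_microblog(["haha", "agree"], [["football", "match", "goal"]])
    corpus = [implicit_post, ["football", "goal", "team"], ["stock", "market", "price"]]
    theta, _, _ = gibbs_lda(corpus, num_topics=2)
    for row in theta:
        print([round(p, 2) for p in row])
```

Pooling is the simplest possible form of expansion; a model that treats comment groups and retransmissions as separate parts of the generative process, as the abstract describes, would need its own sampling equations.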
