
Hot topic detection method for micro-blog communities based on comments trees (cited by: 4)
Abstract This paper first analyzes the characteristics of micro-blog text and designs a filtering algorithm for spam micro-blogs. To address the data sparsity of micro-blogs, it exploits the close ties within a community and proposes the concept of a micro-blog comments tree together with a hot topic evaluation model. Building on these two components, it then proposes a hot topic detection method for micro-blog communities. Experiments on real datasets demonstrate the necessity of the filtering step and the effectiveness of the hot topic evaluation model and the proposed detection method.
Source Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2014, No. 12, pp. 3776-3779, 3827 (5 pages)
Funding National High Technology Research and Development Program of China ("863" Program), grants 2011AA010603 and 2011AA010605
Keywords micro-blog community; hot topic; filtering; comments tree; hot topic evaluation model
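The abstract names a comments tree and a hot topic evaluation model but does not define either, so the following is only a minimal sketch under stated assumptions: the tree is assumed to be rooted at an original post with comments and replies as children, and `heat_score` is a hypothetical linear score chosen for illustration, not the model from the paper. All class names, function names, and weights are invented here.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class CommentNode:
    """One post or comment; children are the direct replies to it (assumed layout)."""
    author: str
    text: str
    children: List["CommentNode"] = field(default_factory=list)

    def reply(self, author: str, text: str) -> "CommentNode":
        """Attach a reply under this node and return the new child."""
        child = CommentNode(author, text)
        self.children.append(child)
        return child


def tree_size(node: CommentNode) -> int:
    """Number of nodes in the tree, counting the root post."""
    return 1 + sum(tree_size(c) for c in node.children)


def tree_depth(node: CommentNode) -> int:
    """Length of the longest reply chain, counting the root as level 1."""
    return 1 + max((tree_depth(c) for c in node.children), default=0)


def participants(node: CommentNode) -> Set[str]:
    """Distinct authors appearing anywhere in the tree."""
    users = {node.author}
    for c in node.children:
        users |= participants(c)
    return users


def heat_score(root: CommentNode, w_size: float = 1.0,
               w_users: float = 1.0, w_depth: float = 1.0) -> float:
    """Hypothetical score (NOT the paper's model): weighted sum of comment
    count, number of commenters other than the poster, and reply-chain depth."""
    comments = tree_size(root) - 1
    commenters = len(participants(root) - {root.author})
    chain = tree_depth(root) - 1
    return w_size * comments + w_users * commenters + w_depth * chain


if __name__ == "__main__":
    post = CommentNode("alice", "original post")
    first = post.reply("bob", "first comment")
    first.reply("alice", "reply to bob")
    post.reply("carol", "second comment")
    print(heat_score(post))
```

Aggregating over the whole tree rather than scoring each short message alone is one plausible way to work around sparse text, which is the motivation the abstract gives; how the paper actually combines these signals is not stated in this record.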
