
Online topic detection in microblogs based on discriminative language model (cited by: 2)
Abstract: Topic detection in microblogs must cope with high-dimensional data, noisy content, and rapid topic evolution. To address these challenges, this paper proposes an online topic detection model for microblogs, the discriminative language model (DLM). DLM first selects a discriminative feature subspace of the microblog data and then detects topics online with a unigram language model. Experiments show that DLM clearly outperforms existing models on metrics such as MACRO_F1 and AVG_CDET, and that it detects microblog topics accurately and promptly.
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2014, No. 12, pp. 3539-3542 (4 pages)
Funding: Key Project of the Hunan Province Industrial Support Program (2012GK2006); Jishou University Research Fund Project (Jdzd12011)
Keywords: topic detection; feature selection; microblog; language model; discriminative language model (DLM)
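The two-stage pipeline named in the abstract (feature selection, then online detection with a unigram language model) can be illustrated with a minimal sketch. The abstract does not give the paper's selection criterion, smoothing scheme, or thresholds, so everything specific here is an assumption made for illustration: the document-frequency filter in `select_features`, the add-alpha smoothing in `Topic`, the single-pass assignment rule, and the `threshold` value are hypothetical stand-ins, not the authors' DLM.

```python
import math
from collections import Counter


def select_features(posts, min_df=2, max_df_ratio=0.5):
    """Stand-in feature selection (assumption): keep terms that are
    neither very rare (likely noise) nor present in most posts."""
    df = Counter()
    for tokens in posts:
        df.update(set(tokens))  # count each term once per post
    limit = max_df_ratio * len(posts)
    return {t for t, n in df.items() if min_df <= n <= limit}


class Topic:
    """A topic represented as a unigram language model with add-alpha smoothing."""

    def __init__(self, vocab_size, alpha=0.1):
        self.counts = Counter()
        self.total = 0
        self.vocab_size = vocab_size
        self.alpha = alpha

    def add(self, tokens):
        self.counts.update(tokens)
        self.total += len(tokens)

    def avg_log_prob(self, tokens):
        # Length-normalized log-likelihood of the post under this topic.
        denom = self.total + self.alpha * self.vocab_size
        logs = (math.log((self.counts[t] + self.alpha) / denom) for t in tokens)
        return sum(logs) / len(tokens)


def detect_topics(posts, features, threshold=-2.0):
    """Single-pass detection: return one topic index per post,
    or None when no selected feature occurs in the post."""
    topics, labels = [], []
    for tokens in posts:
        kept = [t for t in tokens if t in features]
        if not kept:
            labels.append(None)
            continue
        scores = [tp.avg_log_prob(kept) for tp in topics]
        best = max(range(len(topics)), key=scores.__getitem__, default=None)
        if best is None or scores[best] < threshold:
            # No existing topic explains the post well enough: open a new one.
            topics.append(Topic(len(features)))
            best = len(topics) - 1
        topics[best].add(kept)  # update the chosen topic's model online
        labels.append(best)
    return labels
```

Posts are passed as lists of tokens in arrival order. Because each topic model is updated as posts are assigned, the sketch processes a stream in one pass, which is the sense of "online" assumed here.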

