Thread Labeling for News Event

Abstract: Automatic thread labeling for news events helps people understand the different aspects of a news event. In this paper, we present a method to label the threads of a news event. We use the latent Dirichlet allocation (LDA) topic model to extract news threads from a news corpus. Our method first selects a subset of thread words and then extracts phrases based on co-occurrence calculation. The extracted phrase is then used as the label of a news thread. Experimental results show that about 60% of the generated labels convey meaningful aspects of a news event. These labels can help people quickly capture many different aspects of a news event.
Authors: 闫泽华, 李芳
Source: Journal of Shanghai Jiaotong University (Science) (上海交通大学学报(英文版)), EI-indexed, 2013, No. 4, pp. 418-424 (7 pages)
Funding: the National Natural Science Foundation of China (No. 60873134)
Keywords: computer; information processing; application program; CNNIC; news event; topic labeling; latent Dirichlet allocation (LDA)
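
The phrase-extraction step described in the abstract can be illustrated with a minimal sketch. It assumes the thread words have already been selected (for example, as the top words of an LDA topic) and uses a simple adjacent-pair count as the co-occurrence measure. The function name `label_thread`, the toy documents, and the scoring rule are illustrative assumptions, not the authors' actual procedure, which the abstract does not specify.

```python
from collections import Counter


def label_thread(docs, thread_words, top_k=1):
    """Rank adjacent pairs of thread words by how often they co-occur.

    docs         -- list of tokenized documents (lists of strings)
    thread_words -- words assumed to characterize one news thread
    top_k        -- number of candidate labels to return
    """
    words = set(thread_words)
    pair_counts = Counter()
    for tokens in docs:
        # Count only neighbouring tokens that are both thread words.
        for left, right in zip(tokens, tokens[1:]):
            if left in words and right in words:
                pair_counts[(left, right)] += 1
    return [" ".join(pair) for pair, _ in pair_counts.most_common(top_k)]


# Toy corpus (hypothetical data, for illustration only).
docs = [
    ["earthquake", "rescue", "team", "arrives"],
    ["rescue", "team", "finds", "survivors"],
    ["earthquake", "damage", "report"],
]
thread_words = ["rescue", "team", "earthquake", "survivors"]
labels = label_thread(docs, thread_words)
print(labels)
```

In this sketch, restricting candidates to pairs made up entirely of thread words plays the role of the "thread words subset" selection, and the pair with the highest count becomes the label. A real system would likely need a more robust association measure and language-specific tokenization.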