Online event detection in news stream
Abstract: Event detection in news streams is an important research area in the topic detection and tracking community, but most existing event detection methods are offline and inaccurate. To address these problems of timeliness and accuracy, an online event detection method for news streams is proposed. The occurrence of an event is usually accompanied by a marked rise, within the corresponding time span, in the frequency of the features (keywords) that make up the event; such features are called bursty features. A goodness-of-fit test is applied to check whether the distribution of a feature's frequency in the news reports of a given time span has changed significantly, and a left-tailed significance test is then used to confirm all bursty features in that time span. The correlations among bursty features are analyzed, and evolutionary spectral clustering groups highly correlated bursty features into events. Experiments on Reuters Corpus Volume 1 show that the method effectively identifies bursty features and detects events in a timely manner, and that the detected events agree closely with the corresponding real-world events.
Source: Journal of Zhejiang University: Engineering Science (《浙江大学学报(工学版)》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list. 2011, No. 6, pp. 1006-1012 (7 pages).
Funding: National Key Technology R&D Program of China (2008BAH26B00)
Keywords: online event detection; evolutionary spectral clustering; hypothesis test; news stream
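The burst-detection idea in the abstract (a feature is bursty when its frequency in the current time span rises significantly) can be illustrated with a minimal sketch. This is not the paper's procedure: the abstract does not give the test statistics, so the sketch swaps in a simple one-sided binomial tail test on document frequency. The function names, the add-one smoothing, the `alpha` threshold, and the toy data are all assumptions made for illustration.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def bursty_features(baseline_docs, window_docs, alpha=0.01):
    """Return {term: p_value} for terms whose document frequency in
    window_docs is significantly higher than in baseline_docs.

    Each document is a set of terms. The one-sided binomial test is an
    illustrative stand-in, not the test described in the paper.
    """
    n_base = len(baseline_docs)
    n_win = len(window_docs)
    vocab = set().union(*window_docs)
    result = {}
    for term in vocab:
        base_count = sum(term in doc for doc in baseline_docs)
        # Add-one smoothing (an assumption) avoids a zero baseline rate.
        p0 = (base_count + 1) / (n_base + 2)
        k = sum(term in doc for doc in window_docs)
        p_value = binomial_upper_tail(k, n_win, p0)
        if p_value < alpha:
            result[term] = p_value
    return result


# Toy data (hypothetical): "market" is always frequent, "earthquake" surges.
baseline = [{"market"} for _ in range(98)] + [{"market", "earthquake"}] * 2
window = [{"market", "earthquake"}] * 10 + [{"market"}] * 10
print(sorted(bursty_features(baseline, window)))
```

In this toy stream only the term whose frequency jumps relative to the baseline is flagged; a term that is frequent all the time is not. Grouping the flagged features into events would be a separate step, which the abstract attributes to evolutionary spectral clustering and which is not sketched here.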
