
A New Topic Detection and Tracking Approach Combining Periodic Classification and Single-Pass Clustering (cited by: 28)
Abstract: In incremental clustering, the initial topic models are insufficient and inaccurate, and the cumulative effect of false alarms and misses is amplified as the number of processed stories grows. To address this problem, this paper proposes a topic detection and tracking method that combines periodic classification with Single-Pass clustering. An incremental clustering algorithm first performs topic detection and tracking. Each time a certain number of news texts has accumulated, the stories that have already been clustered are periodically re-classified. This improves the precision of the topic clusters and, in turn, the accuracy of subsequent topic detection and tracking. Experiments show that the method is effective: it lowers the miss rate and the false alarm rate and reduces the normalized detection cost.
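The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the bag-of-words vectors, cosine similarity, the parameters `threshold`, `period`, and `k`, and all function names are assumptions, since the abstract does not specify them. The keywords suggest a k-nearest neighbor classifier for the periodic step, so the sketch uses a simple kNN vote. The cost function follows the usual TDT evaluation convention, with default constants (C_miss = 1, C_fa = 0.1, P_target = 0.02) that are likewise assumed rather than taken from the abstract.

```python
import math
from collections import Counter


def vectorize(text):
    """Term-frequency bag of words (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_label(i, vectors, labels, k, min_sim):
    """Majority topic among the k most similar other stories.

    Stories with no neighbor at or above min_sim keep their label, so
    isolated singleton topics are not forced into another cluster.
    """
    scored = [(cosine(vectors[i], vectors[j]), j)
              for j in range(len(vectors)) if j != i]
    nearest = sorted((s for s in scored if s[0] >= min_sim),
                     key=lambda s: s[0], reverse=True)[:k]
    if not nearest:
        return labels[i]
    return Counter(labels[j] for _, j in nearest).most_common(1)[0][0]


def detect_and_track(stories, threshold=0.3, period=4, k=3):
    """Single-Pass clustering with periodic kNN re-classification."""
    vectors, labels = [], []
    next_topic = 0
    for n, text in enumerate(stories, start=1):
        v = vectorize(text)
        # Build one centroid (summed term vector) per existing topic.
        centroids = {}
        for vec, lab in zip(vectors, labels):
            centroids.setdefault(lab, Counter()).update(vec)
        # Single-Pass step: join the most similar topic or start a new one.
        best, best_sim = None, 0.0
        for lab, cen in centroids.items():
            sim = cosine(v, cen)
            if sim > best_sim:
                best, best_sim = lab, sim
        if best is not None and best_sim >= threshold:
            label = best
        else:
            label = next_topic
            next_topic += 1
        vectors.append(v)
        labels.append(label)
        # Periodic step: every `period` stories, re-label all stories
        # clustered so far, voting against a snapshot of current labels.
        if n % period == 0:
            labels = [knn_label(i, vectors, labels, k, threshold)
                      for i in range(len(vectors))]
    return labels


def normalized_detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1,
                              p_target=0.02):
    """TDT-style normalized detection cost (assumed default constants)."""
    c_det = c_miss * p_miss * p_target + c_fa * p_fa * (1 - p_target)
    return c_det / min(c_miss * p_target, c_fa * (1 - p_target))
```

With four short hypothetical stories alternating between an earthquake topic and an election topic, `detect_and_track` would be expected to return two topic labels in alternating order. A larger `period` trades re-classification cost against how long early clustering errors can accumulate.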
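Source: Journal of Beijing Jiaotong University (indexed in CAS, CSCD, and the Peking University Core Journals list), 2009, No. 5, pp. 85-89 (5 pages)
Funding: Supported by the Key Project of Science and Technology Research of the Ministry of Education (108126)
Keywords: topic detection and tracking; incremental clustering; text categorization; k-nearest neighbor classifier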


