Topic Detection Approach Based on Adaptive Center Vector (cited by: 2)
Abstract: Two important factors affect the performance of topic detection: distinguishing similar topics and topic drift. To address both, this paper proposes a topic detection approach based on an adaptive center vector. The approach applies named-entity information to feature representation, combining a named-entity vector with a keyword vector to represent the topic center vector so that similar topics can be told apart effectively. Topics are detected by incremental (single-pass) clustering, and the topic center is revised continuously during clustering to address topic drift. Experimental results and performance comparisons show that the approach effectively improves topic detection performance.
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the PKU Core Journals list; 2009, No. 3, pp. 80-82 (3 pages)
Funding: National "863" Program of China (grant 2007AA01Z439)
Keywords: topic detection; topic drift; named entity; topic center vector
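
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives no formulas, so the cosine similarity, the mixing weight `alpha`, the decision `threshold`, and the running-mean center update are all assumptions chosen for illustration, as are the names `Topic` and `detect_topics`. Each story is assumed to arrive already split into a list of named entities and a list of keywords.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class Topic:
    """A topic whose center is a pair of vectors: named entities and keywords."""

    def __init__(self, entities, keywords):
        self.entity_center = dict(entities)
        self.keyword_center = dict(keywords)
        self.size = 1

    def similarity(self, entities, keywords, alpha):
        # alpha is an assumed weight of the named-entity part vs. the keyword part.
        return (alpha * cosine(self.entity_center, entities)
                + (1.0 - alpha) * cosine(self.keyword_center, keywords))

    def absorb(self, entities, keywords):
        # Assumed update rule: running mean over all member stories, so the
        # center follows the topic as new stories shift its vocabulary.
        n = self.size
        for center, vec in ((self.entity_center, entities),
                            (self.keyword_center, keywords)):
            for term in set(center) | set(vec):
                center[term] = (center.get(term, 0.0) * n
                                + vec.get(term, 0.0)) / (n + 1)
        self.size = n + 1


def detect_topics(stories, alpha=0.5, threshold=0.3):
    """Single-pass clustering; returns the topic index assigned to each story."""
    topics, labels = [], []
    for entities, keywords in stories:
        ents, keys = Counter(entities), Counter(keywords)
        scores = [t.similarity(ents, keys, alpha) for t in topics]
        best = max(range(len(topics)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            topics[best].absorb(ents, keys)   # join the topic, revise its center
        else:
            topics.append(Topic(ents, keys))  # no close topic: start a new one
            best = len(topics) - 1
        labels.append(best)
    return labels
```

Calling `detect_topics` on a list of `(entities, keywords)` pairs processes the stories in arrival order. Keeping the named-entity and keyword vectors separate lets two stories with similar vocabulary but different people or places score lower than a single merged vector would, which is the motivation the abstract gives for combining the two.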