Chinese Story Link Detection Based on Dynamic Co-occurrence (cited by: 1)
Abstract: Story link detection, a subtask of topic detection and tracking, determines whether two randomly selected news stories discuss the same topic. Motivated by the word co-occurrence model and taking the characteristics of story link detection into account, the paper proposes a dynamic co-occurrence relationship among words and implements a story similarity computation method based on that relationship. It then discusses how the similarity method applies to Chinese story link detection. Experimental results show that dynamic co-occurrence reflects the semantic information of a story to some degree, and that the similarity method substantially improves the performance of the Chinese story link detection system.
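The abstract does not give the definition of the dynamic co-occurrence relationship, so the following is only a minimal sketch of the general idea it builds on: represent each story by the word pairs that co-occur within a short window, compare two stories by cosine similarity, and declare them linked when the score passes a threshold. The function names, the window size, and the threshold are all assumptions made for illustration; this is a plain static co-occurrence model, not the paper's method. Stories are assumed to be already segmented into word tokens, which Chinese text requires.

```python
from collections import Counter
from math import sqrt


def cooccurrence_pairs(tokens, window=3):
    """Count unordered word pairs where the second word is one of the
    next (window - 1) tokens after the first. Hypothetical helper."""
    pairs = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + window]:
            if left != right:
                pairs[tuple(sorted((left, right)))] += 1
    return pairs


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_linked(tokens_a, tokens_b, threshold=0.2, window=3):
    """Return (decision, similarity). The threshold is an assumed value
    that would normally be tuned on held-out story pairs."""
    sim = cosine(
        cooccurrence_pairs(tokens_a, window),
        cooccurrence_pairs(tokens_b, window),
    )
    return sim >= threshold, sim


story_a = ["earthquake", "hits", "city", "rescue", "teams", "arrive"]
story_b = ["stock", "market", "rises", "sharply"]
linked, score = is_linked(story_a, story_b)
```

In this sketch, two stories that share no co-occurring word pairs score 0.0 and are reported as not linked, while a story compared with itself scores 1.0 up to floating-point rounding.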
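Author: Pang Haijie (庞海杰)
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, Peking University Core Journal, 2012, No. 3, pp. 115-117 (3 pages)
Funding: National Natural Science Foundation of China (60773034)
Keywords: story link detection; topic detection and tracking; dynamic co-occurrence; normalized detection cost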
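The keywords name normalized detection cost as the evaluation measure. As a rough illustration, the sketch below computes it in the form commonly used in TDT evaluations: a weighted sum of the miss and false-alarm rates, divided by the cost of the better of the two trivial systems (always "yes" or always "no"). The default constants (C_miss = 1.0, C_fa = 0.1, P_target = 0.02) are the values usually quoted for TDT and are an assumption here, since the record does not state which settings the paper used.

```python
def normalized_detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    """Normalized detection cost from miss and false-alarm rates.

    p_miss: fraction of truly linked pairs judged as not linked.
    p_fa:   fraction of unlinked pairs judged as linked.
    The cost constants and target prior are assumed defaults.
    """
    # Unnormalized cost: expected penalty per story pair.
    c_det = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    # Cost of the cheaper trivial system (answer "no" always, or "yes" always).
    baseline = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return c_det / baseline


# Hypothetical rates, not results from the paper.
cost = normalized_detection_cost(p_miss=0.10, p_fa=0.01)
```

Lower values are better: a perfect system scores 0, and under these constants a system that never reports a link scores 1.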