Computation on Sentence Semantic Distance for Novelty Detection

Abstract: Novelty detection aims to retrieve new information and filter out redundancy from given sentences that are relevant to a specific topic. In TREC 2003, the authors tried an approach to novelty detection based on semantic distance computation. The motivation is to expand a sentence by introducing semantic information. The computation of semantic distance between sentences combines WordNet with statistical information. Novelty detection is treated as a binary classification problem: a sentence is either new or not. The feature vector, used in the vector space model for classification, consists of various factors, including the semantic distance from the sentence to the topic and the distance from the sentence to the relevant context that precedes it. New sentences are then detected with Winnow and support vector machine classifiers, respectively. Several experiments are conducted to examine the relationship between the different factors and performance. The results indicate that semantic computation is promising for novelty detection. The ratio of the number of new sentences to the number of relevant sentences is further studied for different relevant document sizes. The ratio is found to decrease at a certain rate (about 0.86). Another group of experiments is then performed under the supervision of this ratio, and they show that the ratio helps to improve novelty detection performance.
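The abstract names Winnow as one of the two classifiers but gives no configuration details. As a rough illustration only, the following is a minimal sketch of the textbook Winnow update rule (multiplicative promotion and demotion of weights on mistakes), assuming binary features. The four feature columns, the toy data, and the parameter values are hypothetical and are not taken from the paper; in particular, how the semantic distances would be thresholded into binary features is an assumption.

```python
from typing import List, Sequence


class Winnow:
    """Textbook Winnow for binary features (a generic sketch, not the paper's setup)."""

    def __init__(self, n_features: int, alpha: float = 2.0) -> None:
        self.alpha = alpha                 # promotion/demotion factor (assumed value)
        self.theta = float(n_features)     # decision threshold (a common default)
        self.w: List[float] = [1.0] * n_features

    def predict(self, x: Sequence[int]) -> int:
        # Sum the weights of the active features and compare with the threshold.
        score = sum(w for w, xi in zip(self.w, x) if xi)
        return 1 if score >= self.theta else 0

    def update(self, x: Sequence[int], y: int) -> None:
        # Weights change only when the current prediction is wrong.
        if self.predict(x) == y:
            return
        # Missed a positive: promote active weights; false alarm: demote them.
        factor = self.alpha if y == 1 else 1.0 / self.alpha
        self.w = [w * factor if xi else w for w, xi in zip(self.w, x)]

    def fit(self, X: Sequence[Sequence[int]], Y: Sequence[int], epochs: int = 10) -> None:
        for _ in range(epochs):
            for x, y in zip(X, Y):
                self.update(x, y)


# Hypothetical binarized features per sentence (illustrative names only):
# [far_from_topic, far_from_previous_context, long_sentence, has_named_entity]
X = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
]
# Toy labels: 1 = new sentence, 0 = redundant sentence.
Y = [1, 0, 1, 0, 1, 0]

clf = Winnow(n_features=4)
clf.fit(X, Y)
predictions = [clf.predict(x) for x in X]
```

In this toy data the label depends only on the second feature, so training ends with that feature carrying the largest weight. Winnow's multiplicative updates are often chosen for sparse, high-dimensional text features where few features are relevant.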
Source: Journal of Computer Science & Technology (English edition; 计算机科学技术学报), indexed in SCIE, EI, CSCD; 2005, No. 3, pp. 331-337 (7 pages)
Funding: National Key Basic Research Program of China (973 Program); project of the China network computing and information security management center
Keywords: novelty detection; sentence semantic distance; categorization