A New Approach for Multi-Document Update Summarization (Cited by: 2)

Abstract: Rapidly changing knowledge on the Internet can be acquired more efficiently with the help of automatic document summarization and update techniques. This paper describes a novel approach to multi-document update summarization. The best summary is defined as the one with the minimum information distance to the entire document set. The best update summary is the one with the minimum conditional information distance to a document cluster, given that a prior document cluster has already been read. Experiments on the DUC/TAC 2007 to 2009 datasets (http://duc.nist.gov/, http://www.nist.gov/tac/) show that our method correlates closely with human summaries and outperforms other systems such as LexRank in many categories under the ROUGE evaluation criterion.
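The selection criterion in the abstract can be illustrated with a rough sketch. Kolmogorov complexity is not computable, so the code below substitutes zlib compressed length for it, approximates K(x | y) as C(y + x) - C(y), and takes the information distance as max{K(x | y), K(y | x)}. The compression-based approximation, the greedy search, and all function names (`c`, `cond`, `distance`, `greedy_summary`) are assumptions made for illustration only; this record does not say how the paper estimates the distance in practice.

```python
import zlib


def c(text: str) -> int:
    """Compressed length in bytes: a crude stand-in for K(text) (assumption)."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def cond(x: str, y: str) -> int:
    """Approximate K(x | y) as C(y + x) - C(y)."""
    return c(y + " " + x) - c(y)


def distance(x: str, y: str, prior: str = "") -> int:
    """Approximate max{K(x | y, prior), K(y | x, prior)}.

    With an empty prior this plays the role of the plain information
    distance; with a non-empty prior, the conditional distance used for
    update summaries.
    """
    return max(cond(x, prior + " " + y), cond(y, prior + " " + x))


def greedy_summary(sentences, budget, prior=""):
    """Greedily add sentences while the approximate distance to the
    document set keeps shrinking and the character budget is respected.

    Greedy search is a hypothetical simplification, not the paper's search.
    """
    docs = " ".join(sentences)
    chosen = []
    best = distance("", docs, prior)
    while True:
        # Candidate sentences that are unused and still fit in the budget.
        candidates = [
            s for s in sentences
            if s not in chosen and len(" ".join(chosen + [s])) <= budget
        ]
        if not candidates:
            break
        scored = [(distance(" ".join(chosen + [s]), docs, prior), s)
                  for s in candidates]
        d, s = min(scored)
        if d >= best:  # no remaining sentence reduces the distance
            break
        chosen.append(s)
        best = d
    return chosen
```

Calling `greedy_summary(sentences, budget=250)` targets an ordinary summary; passing the text of an already-read cluster as `prior` switches to the update setting. Compressors behave erratically on very short inputs, so the scores are only indicative.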
Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), indexed in SCIE, EI, and CSCD, 2010, Issue 4, pp. 739-749 (11 pages)
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60973104, the National Basic Research 973 Program of China under Grant No. 2007CB311003, and the IRCI Project from IDRC, Canada.
Keywords: data mining, text mining, Kolmogorov complexity, information distance

References (25)

  • 1. Luhn H P. The automatic creation of literature abstracts. IBM Journal of Research and Development, 1958, 2(2): 159-165.
  • 2. Wan X, Yang J, Xiao J. Manifold-ranking based topic-focused multi-document summarization. In Proc. IJCAI, Hyderabad, India, Jan. 6-12, 2007, pp.2903-2908.
  • 3. Li M, Vitanyi P M. An Introduction to Kolmogorov Complexity and Its Applications. Springer-Verlag, 1997.
  • 4. Carbonell J, Goldstein J. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Proc. SIGIR, Melbourne, Australia, Aug. 24-28, 1998, pp.335-336.
  • 5. Radev D R, Jing H, Stys M, Tam D. Centroid-based summarization of multiple documents. Information Processing and Management, 2004, 40(6): 919-938.
  • 6. Kupiec J, Pedersen J, Chen F. A trainable document summarizer. In Proc. SIGIR, Seattle, USA, Jul. 9-13, 1995, pp.68-73.
  • 7. Leskovec J, Milic-Frayling N, Grobelnik M. Impact of linguistic analysis on the semantic graph coverage and learning of document extracts. In Proc. AAAI, Pittsburgh, USA, Jul. 9-13, 2005, pp.1069-1074.
  • 8. Shen D, Sun J T, Li H, Yang Q, Chen Z. Document summarization using conditional random fields. In Proc. IJCAI, Hyderabad, India, Jan. 6-12, 2007, pp.2862-2867.
  • 9. Zhang J, Cheng X, Wu G, Xu H. Adasum: An adaptive model for summarization. In Proc. CIKM, Napa Valley, USA, Oct. 26-30, 2008, pp.901-909.
  • 10. Erkan G, Radev D R. LexPageRank: Prestige in multi-document text summarization. In Proc. EMNLP, Barcelona, Spain, Jul. 25-26, 2004, pp.365-371.

