Abstract
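In addition to meeting the two requirements of traditional topic-focused multi-document summarization, namely topic relevance and information diversity, update summarization must also satisfy the user's need for novel information. This paper proposes HeatSum, an extractive summarization algorithm for update summarization based on a heat conduction model. The method naturally exploits the relationships between sentences and the topic, between new and old sentences, and between already-selected and candidate sentences, and it finds sentences that are topic-relevant, diverse in information, and novel in content. Experimental results show that HeatSum performs comparably to the best extractive method that took part in the TAC09 evaluation and outperforms the other baseline methods.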
Besides topic relevance and information diversity, which traditional topic-focused multi-document summarization already addresses, update summarization must also address information novelty. This paper proposes HeatSum, an extractive approach to update summarization based on heat conduction. The heat conduction process naturally exploits the relationships among the given topic, the old sentences, the new sentences, and the sentences already selected and still to be selected, in order to find suitable sentences for the update summary. HeatSum can therefore address all three problems simultaneously in a unified way. Experiments on the TAC 2009 benchmark with ROUGE evaluation show that HeatSum performs comparably to the best-performing systems in the TAC tasks and significantly outperforms the other baseline methods.
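The abstract does not spell out the heat conduction formulation, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a bag-of-words cosine similarity graph, treats the topic as a heat source held at temperature 1, treats old sentences and already-selected sentences as sinks held at temperature 0, and picks the hottest remaining new sentence in each round. The function names (`cosine`, `diffuse`, `select_sentences`) and the parameters `steps` and `alpha` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def diffuse(weights, fixed, steps=50, alpha=0.5):
    """Run discrete heat diffusion on a weighted graph.

    `fixed` maps node index -> temperature that is held constant
    (sources at 1.0, sinks at 0.0). All other nodes start at 0.0 and
    move toward the weighted average temperature of their neighbours.
    """
    n = len(weights)
    heat = [fixed.get(i, 0.0) for i in range(n)]
    for _ in range(steps):
        nxt = heat[:]
        for i in range(n):
            if i in fixed:
                continue
            degree = sum(weights[i])
            if degree == 0:
                continue  # isolated node: no heat can reach it
            avg = sum(weights[i][j] * heat[j] for j in range(n)) / degree
            nxt[i] = heat[i] + alpha * (avg - heat[i])
        heat = nxt
    return heat


def select_sentences(topic, old, new, k=2):
    """Greedily pick up to k new sentences (assumed scheme, see lead-in)."""
    texts = [topic] + old + new
    bags = [Counter(t.lower().split()) for t in texts]
    n = len(texts)
    weights = [[0.0 if i == j else cosine(bags[i], bags[j]) for j in range(n)]
               for i in range(n)]
    fixed = {0: 1.0}                      # topic node is the heat source
    for i in range(1, 1 + len(old)):
        fixed[i] = 0.0                    # old sentences absorb heat (novelty)
    first_new = 1 + len(old)
    chosen = []
    for _ in range(min(k, len(new))):
        heat = diffuse(weights, fixed)
        candidates = [i for i in range(first_new, n) if i not in fixed]
        best = max(candidates, key=lambda i: heat[i])
        chosen.append(texts[best])
        fixed[best] = 0.0                 # selected sentence becomes a sink (diversity)
    return chosen
```

Under these assumptions, a sentence heats up when it is close to the topic and cools down when it is close to old or already-selected content, which is one way the three criteria (relevance, novelty, diversity) can be handled by a single process.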
Source
《模式识别与人工智能》
EI
CSCD
Peking University Core Journals (北大核心)
2012, No. 3, pp. 367-374 (8 pages)
Pattern Recognition and Artificial Intelligence
Funding
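Supported by the Key Program of the National Natural Science Foundation of China (No. 60933005)
National Natural Science Foundation of China (Nos. 60903139, 61003166)
National 863 Program of China (No. 2010AA012500)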
Keywords
Update Summarization
Topic-Oriented Multi-Document Summarization
Heat Conduction Model