Abstract
Existing multi-document summarization methods ignore entities and produce only generic summaries. To address this, this paper proposes an entity-oriented, evolutionary (timeline) multi-document summarization method. A probabilistic topic model first jointly models the evolution of document topics and the participation of entities, and an efficient Gibbs sampler is developed for this model. Each sentence is then scored based on the discovered topics and the entity in question, so the same sentence may receive different scores for different entities, and the highest-scoring sentences are selected as the summary. Extensive experiments on real-world datasets show that the method generates personalized summaries of an event's development for different entities and also produces better generic summaries than existing methods, outperforming the baseline model under ROUGE evaluation.
Source
Journal of Guangxi Normal University: Natural Science Edition (《广西师范大学学报(自然科学版)》)
Indexed in: CAS; PKU Core Journals (北大核心)
2015, No. 2, pp. 36-41 (6 pages)
Funding
Supported by major projects of the National "863" Program (2014AA7013033, 2014AA7115061, 2014AA7115028)
Keywords
multi-document summarization
probabilistic topic model
natural language processing