Abstract
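This paper proposes a multi-document automatic summarization method based on a concept co-occurrence model. The method uses HowNet to obtain concepts and builds a concept vector space model. It builds a concept co-occurrence model from the attraction and repulsion of words and from concept co-occurrence frequency, uses that model to compute the weight of each concept, computes sentence weights with the concept vector space model, and generates the summary according to sentence weights and sentence similarity. The generated summaries are evaluated with an improved ROUGE-N method, topic word coverage (TWC), and high-frequency word coverage (HFWC); the results show that the method is effective.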
In this paper, we propose a multi-document summarization method based on a concept co-occurrence model. The method uses HowNet to obtain the concepts of words and constructs a concept vector space model (CVSM). It then uses concept co-occurrence frequency and the lexical attraction and repulsion model to construct the concept co-occurrence model, uses the concept co-occurrence model to compute concept weights, and uses the CVSM to compute sentence weights. The summary is extracted according to the weight and similarity of sentences. Experimental results show that the method is effective and feasible.
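The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so everything below is an assumption made for illustration. Sentences are assumed to be already mapped to concept labels (the HowNet lookup is not shown), a concept's weight is assumed to be its frequency plus its total sentence-level co-occurrence count, the attraction and repulsion model is omitted, and the names `concept_weights`, `cosine`, `summarize`, `max_sentences`, and `sim_threshold` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def concept_weights(sentences):
    """Weight each concept by frequency plus co-occurrence count (assumed formula)."""
    freq = Counter(c for s in sentences for c in s)
    cooc = Counter()
    for s in sentences:
        # Count each unordered concept pair once per sentence.
        for a, b in combinations(sorted(set(s)), 2):
            cooc[(a, b)] += 1
    weights = {}
    for c in freq:
        partner_total = sum(n for pair, n in cooc.items() if c in pair)
        weights[c] = freq[c] + partner_total
    return weights


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v[k] for k, x in u.items() if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def summarize(sentences, max_sentences=2, sim_threshold=0.7):
    """Return indices of selected sentences, in document order."""
    w = concept_weights(sentences)
    # Concept vector for each sentence: concept weight times in-sentence count.
    vectors = [{c: w[c] * n for c, n in Counter(s).items()} for s in sentences]
    # Sentence weight: length-normalized sum of its vector components.
    scores = [sum(v.values()) / len(s) if s else 0.0
              for v, s in zip(vectors, sentences)]
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in order:
        # Skip sentences too similar to ones already selected.
        if all(cosine(vectors[i], vectors[j]) < sim_threshold for j in chosen):
            chosen.append(i)
        if len(chosen) == max_sentences:
            break
    return sorted(chosen)


# Toy input: each sentence is a list of concept labels (hypothetical data).
toy = [
    ["summary", "document", "concept"],
    ["summary", "document", "concept"],
    ["weather", "rain"],
    ["concept", "summary"],
]
print(summarize(toy))
```

Selecting greedily by weight and rejecting near-duplicates by cosine similarity is one common way to combine the two criteria the abstract names; the threshold of 0.7 is an arbitrary illustrative value.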
Source
《计算机工程与科学》
CSCD
Peking University Core Journal
2011, No. 7, pp. 188-192 (5 pages)
Computer Engineering & Science
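Funding
National 863 Program of China (2009AA04Z146)
National Natural Science Foundation of China (90920005)
Guangxi Education Department Project (200808LX338, 200808LX341)
Fujian Provincial Education Department Class B Project (JB09054)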