Abstract
Text summarization should reflect a user's personal information need and give the user indicative content. This paper presents a query-biased summarization mechanism based on passage matching and density distribution. First, each keyword is expanded with its synonyms to form a query profile, and the relevant passages are retrieved by profile-similarity matching. Next, the term density over text windows in these passages is calculated with a Hanning window function, which locates the regions where keywords cluster. Sentences are then weighted by the density distribution of the regions they cover and by the number of keywords they contain, and the most important sentences are extracted as the final query-biased summary. Evaluation through a question-answering test and a similarity comparison showed good results: the mechanism better meets personal information needs and is more effective on multi-topic texts.
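The density step can be illustrated with a minimal Python sketch. It assumes a tokenized passage, a set of already-expanded keywords, and a window width of 8 tokens. The function names `hanning_weight` and `keyword_density` and the exact weighting are illustrative assumptions, since the abstract does not give the paper's formula.

```python
import math


def hanning_weight(offset, window):
    """Hann (Hanning) weight for a token `offset` positions from the window centre.

    The weight is 1.0 at the centre and falls to 0.0 at +/- window / 2.
    """
    if abs(offset) > window / 2:
        return 0.0
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * offset / window))


def keyword_density(tokens, keywords, window=8):
    """Return one density value per token position.

    Assumption: the density at a position is the Hann-weighted count of
    keyword occurrences inside the window centred on that position.
    """
    half = window // 2
    hits = [1.0 if token in keywords else 0.0 for token in tokens]
    density = []
    for centre in range(len(tokens)):
        lo = max(0, centre - half)
        hi = min(len(tokens), centre + half + 1)
        total = sum(
            hits[i] * hanning_weight(i - centre, window) for i in range(lo, hi)
        )
        density.append(total)
    return density


if __name__ == "__main__":
    tokens = ["a", "kw", "b", "c", "d", "e", "f", "g", "h", "i"]
    density = keyword_density(tokens, {"kw"}, window=8)
    peak = max(range(len(density)), key=density.__getitem__)
    print(peak, [round(value, 3) for value in density])
```

Positions with high density mark candidate keyword clusters. How the paper combines this density with per-sentence keyword counts to weight sentences is not specified in the abstract, so that step is left out of the sketch.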
Source
《中文信息学报》
CSCD
PKU Core Journal
2007, Issue 1, pp. 43-48 (6 pages)
Journal of Chinese Information Processing
Funding
Supported by the National Natural Science Foundation of China (60373095, 60673039)
Keywords
computer application
Chinese information processing
text summarization
query-biased summarization
synonymous expansion
passage match
density distribution