Abstract
Sentence similarity computation plays an important role in automatic question answering, machine translation, information retrieval, automatic abstracting and related tasks. This paper first surveys existing methods for computing sentence similarity, then proposes an optimised similarity measure that jointly considers keyword features, semantic features, sentence-pattern features and sentence-length features. Building on sentence weight computation and optimised redundancy resolution, it then implements an improved automatic abstracting algorithm. Experiments on the DUC evaluation corpora show that the algorithm is effective at improving abstract quality. Finally, the paper discusses open problems in automatic abstracting research and points out the research trends in this area.
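The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a weighted sum of two of the four feature types (keyword overlap and sentence length), leaves out the semantic and sentence-pattern features, and uses a simple greedy filter as a stand-in for redundancy resolution. All function names, weights and thresholds are hypothetical.

```python
def tokens(sentence):
    """Lowercase whitespace split; a stand-in for real keyword extraction."""
    return sentence.lower().split()


def keyword_similarity(a, b):
    """Jaccard overlap of the two sentences' word sets."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def length_similarity(a, b):
    """Ratio of the shorter token count to the longer one."""
    la, lb = len(tokens(a)), len(tokens(b))
    if la == 0 or lb == 0:
        return 0.0
    return min(la, lb) / max(la, lb)


def sentence_similarity(a, b, w_keyword=0.7, w_length=0.3):
    """Weighted sum of the two features; the weights are illustrative only."""
    return (w_keyword * keyword_similarity(a, b)
            + w_length * length_similarity(a, b))


def summarize(sentences, max_sentences=2, redundancy_threshold=0.5):
    """Pick high-weight sentences, skipping ones too similar to earlier picks."""
    # Assumed sentence weight: mean similarity to every other sentence.
    weights = []
    for i, s in enumerate(sentences):
        others = [sentence_similarity(s, t)
                  for j, t in enumerate(sentences) if j != i]
        weights.append(sum(others) / len(others) if others else 0.0)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: weights[i], reverse=True)
    chosen = []
    for i in ranked:
        if len(chosen) == max_sentences:
            break
        # Redundancy check against everything already selected.
        if all(sentence_similarity(sentences[i], sentences[j])
               < redundancy_threshold for j in chosen):
            chosen.append(i)
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

For example, given two near-duplicate sentences and one unrelated sentence, `summarize` with the default settings keeps only one of the near-duplicates plus the unrelated sentence. A fuller version would add the semantic and sentence-pattern terms to `sentence_similarity` once those features are defined.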
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
Peking University Core Journal list (北大核心)
2013, Issue 9, pp. 160-162, 182 (4 pages in total)
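Funding
Key Project of Basic and Frontier Technology Research, Science and Technology Department of Henan Province (112300410266)
Key Project of Basic and Frontier Technology Research, Science and Technology Department of Henan Province (112300410262)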
Keywords
Sentence similarity
Automatic abstracting
Sentence-weight computing
Redundancy resolution