Abstract
Sentence similarity calculation is a key problem in natural language processing, and many methods exist for it. This paper addresses the tendency of multi-feature sentence similarity models to produce similarity scores that are too low. On top of word-level semantic similarity, it adds a similar-word calculation and a method for computing the similarity of sentence constituent relations. The improved method avoids the need for an additional synonym dictionary while taking into account multiple features of a sentence: word form, sentence length, word order, semantics, and constituent relations. Experimental results show that the method improves sentence similarity scores to some extent and that it is reasonable, simple, and feasible.
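To make the idea of a multi-feature score concrete, the following is a minimal sketch that combines three of the surface features named in the abstract (word form, sentence length, word order) as a weighted sum. The individual formulas, the function names, and the weights are illustrative assumptions, not the definitions used in the paper. The semantic, similar-word, and constituent-relation components are left out because they depend on a lexical resource and a parser that the abstract does not specify.

```python
from collections import Counter


def word_form_similarity(a, b):
    """Dice-style overlap: shared tokens (with multiplicity) over total tokens."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0
    shared = sum((Counter(a) & Counter(b)).values())
    return 2 * shared / total


def length_similarity(a, b):
    """1 minus the normalized difference in token counts."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0
    return 1 - abs(len(a) - len(b)) / total


def word_order_similarity(a, b):
    """Share of adjacent pairs of common tokens that keep their relative order.

    Only tokens occurring exactly once in both sentences are considered.
    """
    common = [w for w in a if a.count(w) == 1 and b.count(w) == 1]
    if not common:
        return 0.0
    if len(common) == 1:
        return 1.0
    positions = [b.index(w) for w in common]
    inversions = sum(1 for x, y in zip(positions, positions[1:]) if x > y)
    return 1 - inversions / (len(common) - 1)


def combined_similarity(a, b, weights=(0.5, 0.2, 0.3)):
    """Weighted sum of the three features; the weights are assumed, not from the paper."""
    w_form, w_len, w_order = weights
    return (w_form * word_form_similarity(a, b)
            + w_len * length_similarity(a, b)
            + w_order * word_order_similarity(a, b))


# Two pre-tokenized sentences with the same words in a different order.
s1 = ["the", "cat", "chased", "the", "dog"]
s2 = ["the", "dog", "chased", "the", "cat"]
score = combined_similarity(s1, s2)
```

In this sketch the two example sentences agree fully on word form and length but not on word order, so only the word-order term lowers the combined score. A fuller model along the lines the abstract describes would add further weighted terms for semantic and constituent-relation similarity.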
Source
Computer and Modernization (《计算机与现代化》)
2015, Issue 7, pp. 31-33, 39 (4 pages in total)
Funding
Supported by the National Natural Science Foundation of China (61272500)
Keywords
sentence similarity
similar word
constituent relationship
multi-features