Abstract
This paper proposes a method for computing sentence similarity based on measurements over different semantic units. Each sentence is segmented into word blocks, which are divided into the corresponding common word blocks and non-common word blocks, and external semantic resources are used for synonym substitution and semantic disambiguation. Sentence similarity is then measured with words, word blocks, and characters as the semantic units, and different weights adjust the contribution of each unit to the overall similarity. Experimental results show that the method takes a more comprehensive set of factors into account and achieves relatively high accuracy.
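The weighted combination described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything concrete here is an assumption: the per-unit measure (a Dice coefficient over shared units), the weight values, and the function names `overlap_similarity` and `sentence_similarity` are all hypothetical. The synonym substitution and disambiguation steps are omitted, and the input is assumed to be already segmented into words and word blocks.

```python
from collections import Counter


def overlap_similarity(units_a, units_b):
    """Dice coefficient over two multisets of units.

    Assumed measure for illustration; the paper's own formula is not
    given in the abstract.
    """
    count_a, count_b = Counter(units_a), Counter(units_b)
    total = sum(count_a.values()) + sum(count_b.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared unit.
    shared = sum((count_a & count_b).values())
    return 2.0 * shared / total


def sentence_similarity(words_a, words_b, blocks_a, blocks_b,
                        weights=(0.4, 0.4, 0.2)):
    """Weighted sum of word-, word-block-, and character-level similarity.

    The weights are hypothetical placeholders, not values from the paper.
    """
    w_word, w_block, w_char = weights
    # Characters are derived from the segmented words.
    chars_a = list("".join(words_a))
    chars_b = list("".join(words_b))
    return (w_word * overlap_similarity(words_a, words_b)
            + w_block * overlap_similarity(blocks_a, blocks_b)
            + w_char * overlap_similarity(chars_a, chars_b))


if __name__ == "__main__":
    words_a, words_b = ["我", "喜欢", "苹果"], ["我", "喜欢", "香蕉"]
    blocks_a, blocks_b = ["我", "喜欢苹果"], ["我", "喜欢香蕉"]
    print(sentence_similarity(words_a, words_b, blocks_a, blocks_b))
```

Keeping the three unit-level scores separate until the final weighted sum makes it easy to tune how much each granularity contributes, which is the adjustment the abstract describes.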
Source
《信阳师范学院学报(自然科学版)》
CAS
PKU Core Journal (北大核心)
2014, No. 1, pp. 145-148 (4 pages)
Journal of Xinyang Normal University (Natural Science Edition)
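Funding
Governor's Special Fund for Outstanding Science, Technology and Education Talents of Guizhou Province (Qian Sheng Zhuan He Zi (2012) No. 82)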
Keywords
sentence similarity
word block
common word block
Tongyici Cilin
collocation dictionary