Abstract
The vector space model (VSM) is often used to compute the similarity of two sentences: each sentence is converted into a term vector, and the cosine of the angle between the two vectors is taken as the similarity score. The traditional VSM does not account for similarity between the words of the two sentences, so two semantically close sentences that use near-synonyms receive a low score. This paper proposes a vector space model with word-sense features, which introduces word-to-word similarity into the traditional VSM so that the computed sentence similarity score is more accurate.
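The abstract does not give the paper's formula, so the following is only a minimal sketch of the idea under stated assumptions. It contrasts the plain cosine score with a "soft cosine" variant, a known technique swapped in here for illustration and not necessarily the authors' method. The whitespace tokenizer, the `WORD_SIM` table, and its 0.9 value for "buy"/"purchase" are hypothetical.

```python
import math
from collections import Counter

def term_vector(sentence):
    # Assumption: naive lowercase whitespace tokenization.
    return Counter(sentence.lower().split())

def cosine(a, b):
    # Traditional VSM: only identical terms contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical word-similarity table; a real system would derive these
# values from a thesaurus or another lexical resource.
WORD_SIM = {frozenset(("buy", "purchase")): 0.9}

def word_sim(w1, w2):
    if w1 == w2:
        return 1.0
    return WORD_SIM.get(frozenset((w1, w2)), 0.0)

def soft_dot(a, b):
    # Every pair of terms contributes, weighted by its word similarity.
    return sum(a[x] * b[y] * word_sim(x, y) for x in a for y in b)

def soft_cosine(a, b):
    denom = math.sqrt(soft_dot(a, a)) * math.sqrt(soft_dot(b, b))
    return soft_dot(a, b) / denom if denom else 0.0

v1 = term_vector("I buy a car")
v2 = term_vector("I purchase a car")
plain_score = cosine(v1, v2)      # near-synonyms count as a mismatch
soft_score = soft_cosine(v1, v2)  # near-synonyms earn partial credit
```

With these assumed inputs the two sentences differ only in "buy" versus "purchase", so the plain cosine treats that position as a complete mismatch while the soft variant credits it at the assumed similarity, which yields a higher score.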
Source
《成都信息工程学院学报》
2012, Issue 3, pp. 239-242 (4 pages)
Journal of Chengdu University of Information Technology