Abstract
To address the vector representation of words, this paper draws on the idea of word embeddings and on an external dictionary of polysemous words: the context representations of each polysemous word are clustered with K-means to obtain a multi-prototype vector representation of that word. For a polysemous word in a sentence, word sense disambiguation is performed by computing the similarity between each of the word's prototype vectors and the representation of its context. Based on the word-sense similarity of the words shared by two sentences and of the words that differ between them, a sentence similarity computation method built on the multi-prototype vector representation is then given. Experimental results show the effectiveness of the method.
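The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional toy vectors, the deterministic centroid initialization, the function names, and in particular the best-match averaging used in `sentence_similarity` are all assumptions made for illustration, since the abstract does not give the paper's exact formula for combining shared and differing words.

```python
import math

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def kmeans(points, k, iters=20):
    """Plain K-means. Initial centroids are k evenly spaced input points
    (an assumption chosen to keep the example deterministic)."""
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

def pick_sense(prototypes, context_vec):
    """Disambiguate: index of the prototype most similar to the context."""
    return max(range(len(prototypes)),
               key=lambda i: cosine(prototypes[i], context_vec))

def sentence_similarity(vecs_a, vecs_b):
    """Hypothetical stand-in for the paper's measure: average best-match
    cosine similarity between the word vectors, taken in both directions."""
    def directed(xs, ys):
        return sum(max(cosine(x, y) for y in ys) for x in xs) / len(xs)
    return 0.5 * (directed(vecs_a, vecs_b) + directed(vecs_b, vecs_a))

# Toy context vectors for one polysemous word ("bank"): the first three
# contexts are finance-like, the last three are river-like.
bank_contexts = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
                 [0.0, 1.0], [0.1, 0.9], [0.2, 1.0]]
prototypes = kmeans(bank_contexts, k=2)  # one prototype vector per sense

money_sense = pick_sense(prototypes, [0.8, 0.1])  # finance-like context
river_sense = pick_sense(prototypes, [0.1, 0.7])  # river-like context

same = sentence_similarity([prototypes[money_sense]], [prototypes[money_sense]])
diff = sentence_similarity([prototypes[money_sense]], [prototypes[river_sense]])
```

In this sketch each sentence is reduced to a list of word vectors in which every polysemous word has already been replaced by its selected prototype, so two sentences that use the same word in different senses score lower than two that use it in the same sense.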
Authors
GUO Hongqi (郭鸿奇)
LI Guojia (李国佳)
Affiliations: School of Electric Power, North China University of Water Resources and Electric Power, Zhengzhou 450045, China; School of Software, North China University of Water Resources and Electric Power, Zhengzhou 450045, China
Source
Intelligent Computer and Applications (《智能计算机与应用》), 2018, No. 2, pp. 38-42 (5 pages)
Funding
2017 Innovation and Entrepreneurship Program Project of North China University of Water Resources and Electric Power (2017XB136)
Keywords
multi-prototype vector representation
word sense disambiguation
sentence similarity