摘要
词语的语义计算是自然语言处理领域的重要问题之一,目前的研究主要集中在词语语义的相似度计算方面,对词语语义的相关度计算方法研究不够.为此,本文提出了一种基于语义词典和语料库相结合的词语语义相关度计算模型.首先,以HowNet和大规模语料库为基础,制定了相关的语义关系提取规则,抽取了大量的语义依存关系;然后,以语义关系三元组为存储形式,构建了语义关系图;最后,采用图论的相关理论,对语义关系图中的语义关系进行处理,设计了一个基于语义关系图的词语语义相关度计算模型.实验结果表明,本文提出的模型在词语语义相关度计算方面具有较好的效果,在Word Similarity-353数据集上的斯皮尔曼等级相关系数达到了0.5358,显著地提升了中文词语语义相关度的计算效果.
Word semantic computation is one of the important issues in nature language processing. Current studies usually focus on semantic similarity computation of words, not paying enough attention to the semantic relatedness computation. For this reason, we present a word semantic relatedness calculation model based on semantic dictionary and corpus. First of all, the semantic extraction rules are formulated with "HowNet" and corpus, and a large number of semantic dependency relations are extracted based on these rules. Then, a semantic relationship graph is constructed by storing the semantic relationship triplet tuple. At last, graph theory is used to process the semantic relation in the semantic relationship graph and a semantic relatedness calculation model is designed by means of the semantic relationship graph. Experimental results show that this method has a better performance in word semantic relatedness computation, the Spearman rank correlation on the WordSimilarity-353 dataset being up to 0.5358, a significant efficiency improvement of semantic relatedness computation of Chinese words.
出处
《自动化学报》
EI
CSCD
北大核心
2018年第1期87-98,共12页
Acta Automatica Sinica
基金
国家自然科学基金(61370139
61602044)资助~~
关键词
语义相关度
语义关系图
HOWNET
依存语义关系
语义相似度
Semantic relatedness, semantic relationship graph, HowNet, dependency semantic relation, semantic similarity