Abstract
Word sense disambiguation (WSD) is of considerable theoretical and practical importance in natural language processing. This paper studies a HowNet-based sense pruning algorithm for the WSD problem. The aim of sense pruning is to remove, as far as possible, the senses of an ambiguous word that are wrong or least likely in its context. After pruning, each word is associated with a list of plausible senses, and the algorithm tries to keep the truly correct sense of each word on that list. To evaluate the algorithm, an interactive manual sense-tagging environment was developed, and two measures were used: the recall rate and the reduction rate, that is, the reduction in the number of possible senses of a sentence. The effects of the analysis window size and of the choice of analysis unit on the recall rate and the reduction rate were studied.
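The two evaluation measures can be illustrated with a minimal sketch. The abstract does not give their exact formulas, so the definitions below are assumptions: recall is taken as the share of annotated words whose correct sense survives pruning, and reduction rate as the share of candidate senses removed. The function names and the sense labels are hypothetical and are not HowNet entries.

```python
from typing import Dict, List


def recall_rate(pruned: Dict[str, List[str]], gold: Dict[str, str]) -> float:
    """Share of annotated words whose gold sense is still on the pruned list.

    Assumed definition; the abstract does not state the formula.
    """
    if not gold:
        return 0.0
    kept = sum(1 for word, sense in gold.items() if sense in pruned.get(word, []))
    return kept / len(gold)


def reduction_rate(original: Dict[str, List[str]],
                   pruned: Dict[str, List[str]]) -> float:
    """Share of candidate senses removed by pruning (assumed definition)."""
    before = sum(len(senses) for senses in original.values())
    if before == 0:
        return 0.0
    after = sum(len(pruned.get(word, [])) for word in original)
    return (before - after) / before


# Hypothetical sense inventory for two ambiguous words in one sentence.
original = {
    "bank": ["institution", "riverside", "storage"],
    "interest": ["curiosity", "money_charge"],
}
# Hypothetical output of a pruning step.
pruned = {
    "bank": ["institution", "riverside"],
    "interest": ["curiosity"],
}
# Hypothetical manual annotation of the correct senses.
gold = {"bank": "institution", "interest": "money_charge"}

print(recall_rate(pruned, gold))         # 1 of 2 gold senses kept
print(reduction_rate(original, pruned))  # 2 of 5 candidate senses removed
```

Under these assumed definitions the two measures pull against each other: pruning more aggressively raises the reduction rate but risks discarding the correct sense, which lowers recall.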
Source
Control Engineering of China (《控制工程》)
CSCD
Peking University Core Journal (北大核心)
2013, No. 5, pp. 887-890 (4 pages)
Funding
Natural Science Foundation of Inner Mongolia (2009MS0106)
Natural Science Foundation of Inner Mongolia (2013MS0102)
Keywords
word sense disambiguation (WSD)
natural language processing (NLP)
HowNet
sense pruning