Abstract
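Word sense disambiguation (WSD) remains a major challenge in natural language processing; since the field's earliest days it has been regarded as one of its central tasks. This paper proposes an unsupervised word sense disambiguation method that builds context vectors from second-order context, clusters them with the k-means algorithm, and finally disambiguates word senses by computing similarity. The experiments are carried out on top of term extraction, and in two groups of tests on several high-frequency polysemous Chinese words the method achieves good results, with average accuracies of 82.67% and 84.55%.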
Word sense disambiguation (WSD) is still considered one of the most challenging problems in natural language processing. Ever since the field's inception, WSD has been perceived as one of the central problems in NLP. This paper presents an unsupervised approach that constructs context vectors from second-order context, clusters them with k-means, and disambiguates by calculating similarity. Our experiments are based on term extraction, and the average accuracy of this method is 82.62% and 84.55% for 7 ambiguous words in the open test.
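The pipeline named in the abstract (second-order context vectors, k-means clustering, similarity-based sense assignment) can be illustrated with a minimal sketch. It is not the paper's implementation: the toy English corpus, the sentence-level co-occurrence window, the farthest-point initialization, the choice of k = 2, and all function names are assumptions made for illustration, and the term-extraction step the experiments rely on is omitted.

```python
import math

def cooccurrence_vectors(sentences, vocab):
    """First-order vectors: how often each word shares a sentence with each vocab word."""
    index = {w: i for i, w in enumerate(vocab)}
    vecs = {}
    for sent in sentences:
        for w in sent:
            row = vecs.setdefault(w, [0.0] * len(vocab))
            for c in sent:
                if c != w and c in index:
                    row[index[c]] += 1.0
    return vecs

def second_order_vector(context, word_vecs, dim, target):
    """Second-order context vector: mean of the first-order vectors of the context words."""
    rows = [word_vecs[w] for w in context if w != target and w in word_vecs]
    if not rows:
        return [0.0] * dim
    return [sum(col) / len(rows) for col in zip(*rows)]

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def kmeans(points, k, iters=20):
    """Plain k-means with Euclidean distance.
    Assumption: deterministic farthest-point initialization, so the demo is reproducible."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sqdist(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

def disambiguate(context, centroids, word_vecs, dim, target):
    """Assign a new context to the sense cluster whose centroid is most similar (cosine)."""
    v = second_order_vector(context, word_vecs, dim, target)
    return max(range(len(centroids)), key=lambda j: cosine(v, centroids[j]))

# Made-up toy corpus; "bank" is the ambiguous target word.
target = "bank"
sentences = [
    ["bank", "money", "loan", "deposit"],
    ["bank", "loan", "interest", "money"],
    ["bank", "river", "water", "shore"],
    ["bank", "river", "shore", "fishing"],
    ["money", "interest", "deposit"],
    ["water", "fishing", "river"],
]
vocab = sorted({w for s in sentences for w in s} - {target})
word_vecs = cooccurrence_vectors(sentences, vocab)
contexts = [s for s in sentences if target in s]
points = [second_order_vector(c, word_vecs, len(vocab), target) for c in contexts]
centroids, labels = kmeans(points, k=2)
predicted = disambiguate(["bank", "interest", "deposit"], centroids, word_vecs, len(vocab), target)
```

In this toy setting the two financial contexts and the two river contexts fall into separate clusters, and the unseen context is assigned to the financial one. With real data the number of clusters, the co-occurrence window, and the similarity measure would all have to be chosen to match the paper's actual setup, which the abstract does not specify.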
Author
陈浩
CHEN Hao (Department of Computer Science, Guangdong University of Finance and Economics, Huashang College, Guangzhou 510000, China)
Source
《电脑知识与技术》
2015, Issue 4, pp. 67-68, 71 (3 pages in total)
Computer Knowledge and Technology