Abstract
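Part-of-speech information is important for the automatic identification and tagging of word senses. The Contemporary Chinese Dictionary (fifth edition) labels words with their part of speech and sets up senses separately under each part of speech, which makes it convenient to distinguish senses by part of speech. By analysing the part-of-speech profile of the polysemous entries in the dictionary, this paper examines the role and effect of using part of speech to distinguish word senses. That role shows in two ways: part of speech separates word senses from morpheme senses, and it narrows the range of candidate senses. In the dictionary, some polysemous words can be assigned a unique sense from part of speech alone, while for others part of speech reduces the number of candidate senses. Combining the dictionary with sense-tagged corpus data, the paper analyses this question quantitatively and indicates what can be achieved by distinguishing the senses of polysemous words through part of speech.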
This study presents a quantitative analysis of how a word's part of speech (POS) can be used to distinguish its senses in word sense disambiguation. It is a case study of the Contemporary Chinese Dictionary (fifth edition), in which words carry POS tags and senses are set up separately under each POS, so that a polysemous word with several POS attributes has distinct senses for each of them. This feature contributes greatly to word sense disambiguation. The paper first analyses the effect of using POS to distinguish word senses from morpheme senses. Next, it analyses the proportion of conversion words and polysemous words in both the dictionary and the corpora. Finally, it estimates how far POS information reduces the complexity of distinguishing word senses in the corpora.
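The two effects described in the abstract, fixing a unique sense and reducing the number of candidate senses, can be illustrated with a minimal sketch. The toy lexicon, the tag names `n` and `v`, and the functions `senses_for` and `classify` are all hypothetical and chosen only for illustration; they are not taken from the dictionary or from the paper's method.

```python
from collections import defaultdict

# Hypothetical toy sense inventory: word -> list of (POS tag, sense gloss).
# These entries are illustrative only and do not come from the dictionary.
TOY_LEXICON = {
    "lock": [("n", "fastening device"), ("v", "fasten with a lock")],
    "paint": [("n", "coloured coating"), ("v", "apply a coating"),
              ("v", "make a picture")],
    "bank": [("n", "financial institution"), ("n", "side of a river")],
}


def senses_for(word, pos, lexicon=TOY_LEXICON):
    """Return the senses of `word` whose POS tag equals `pos`."""
    return [gloss for tag, gloss in lexicon.get(word, []) if tag == pos]


def classify(word, lexicon=TOY_LEXICON):
    """Classify a polysemous word by how far POS narrows its senses."""
    by_pos = defaultdict(list)
    for tag, gloss in lexicon[word]:
        by_pos[tag].append(gloss)
    if all(len(glosses) == 1 for glosses in by_pos.values()):
        return "unique"     # every POS selects exactly one sense
    if len(by_pos) > 1:
        return "reduced"    # POS shrinks the candidate set, ambiguity remains
    return "unchanged"      # all senses share a single POS


if __name__ == "__main__":
    for word in TOY_LEXICON:
        print(word, classify(word))
```

Counting how many entries fall into each class, over a real sense inventory rather than this toy one, would give the kind of proportion the abstract refers to.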
Source
《云南师范大学学报(哲学社会科学版)》
CSSCI
2010, Issue 1, pp. 47-52 (6 pages)
Journal of Yunnan Normal University:Humanities and Social Sciences Edition
Keywords
part of speech
word sense
sense distinction
word sense disambiguation
disambiguation cues
Contemporary Chinese Dictionary