Abstract
The practice of tagging the senses of polysemous words in corpora shows that dictionaries commonly fail to make word senses sufficiently distinguishable. Based on an analysis of the Contemporary Chinese Dictionary (《现代汉语词典》), this paper argues that the senses of a polysemous word in a dictionary can stand in relations such as overlap, disjointness, and inclusion, and that these relations hinder accurate sense distinction. The problems surface as insufficient cues for telling senses apart and as missing senses, both of which lower the accuracy and practicability of sense distinction. Drawing on sense-tagged corpus data, the paper analyzes each of these problems and argues that clarifying the relations among the senses of a polysemous word and improving sense distinguishability can raise the accuracy of word sense disambiguation and help improve the quality of dictionary compilation.
Source
《云南师范大学学报(哲学社会科学版)》
CSSCI
2010, No. 1, pp. 41-46 (6 pages)
Journal of Yunnan Normal University: Humanities and Social Sciences Edition
Keywords
sense division (义项划分)
sense distinction (词义区分)
word sense tagging (词义标注)
polyseme (多义词)
corpus (语料库)