Abstract
This paper makes full use of the computability of concepts in HowNet and of the information in a sentence-aligned Chinese-English parallel corpus, recasting word sense disambiguation as the problem of computing the similarity between the sense combinations of corresponding sentences in the two languages. Drawing on the idea of dynamic programming, it then designs an algorithm that tags the senses of polysemous words effectively within a bounded time complexity. Earlier sense tagging methods for parallel corpora disambiguated each polysemous word using only that word's own context and alignment information. This method instead considers the context of all words in the aligned sentences at once, so that sense tagging is carried out at the level of the whole sentence. The experimental results are satisfactory.
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
PKU Core Journal (北大核心)
2005, No. 6, pp. 50-56 (7 pages)
Keywords
artificial intelligence
natural language processing
word sense disambiguation
HowNet
parallel corpora