Abstract
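While building a bidirectional bilingual translation dictionary from a large-scale English-Chinese parallel corpus, we found that, because of error accumulation, existing word alignment techniques cannot directly yield high-quality bilingual lexical knowledge. We therefore propose a method that computes similarity based on HowNet and WordNet and then sets a similarity threshold to filter word senses. Experimental results show that the method is effective. An application-based comparison and discussion of the HowNet and WordNet similarity measures leads to the conclusion that HowNet, being finer-grained in its semantic distinctions, achieves higher recall, while WordNet achieves higher precision.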
While using a large-scale English-Chinese bilingual corpus to build a translation dictionary, statistical analysis shows that error accumulation prevents high-quality bilingual lexical knowledge from being acquired directly from the corpus. To address this problem, a method is proposed that uses semantic dictionaries and their similarity measures. Preliminary experiments indicate that the method is effective and feasible. An application-oriented comparison between HowNet and WordNet is also made, and a conclusion is drawn: because of their difference in semantic granularity, HowNet has higher recall while WordNet has higher precision.
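The threshold-based filtering step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `filter_by_similarity`, the toy score table, and the threshold value 0.5 are all assumptions, and the toy scores merely stand in for a HowNet- or WordNet-based similarity measure, which the abstract does not specify in detail.

```python
from typing import Callable, Iterable, List, Tuple

# (English word, candidate Chinese translation) as produced by word alignment
Pair = Tuple[str, str]


def filter_by_similarity(
    candidates: Iterable[Pair],
    similarity: Callable[[str, str], float],
    threshold: float,
) -> List[Pair]:
    """Keep only the candidate pairs whose similarity score meets the threshold."""
    return [pair for pair in candidates if similarity(*pair) >= threshold]


# Made-up scores for illustration only; a real system would compute these
# from a semantic resource such as HowNet or WordNet.
TOY_SCORES = {
    ("bank", "银行"): 0.92,
    ("bank", "河岸"): 0.81,
    ("bank", "的"): 0.05,    # typical alignment noise
    ("river", "银行"): 0.12,  # misaligned pair
}


def toy_similarity(en: str, zh: str) -> float:
    """Hypothetical stand-in for a dictionary-based similarity measure."""
    return TOY_SCORES.get((en, zh), 0.0)


candidates = list(TOY_SCORES)
kept = filter_by_similarity(candidates, toy_similarity, threshold=0.5)
print(kept)
```

In a sketch like this, raising the threshold would generally discard more noisy pairs at the cost of losing some correct ones, which is the precision/recall trade-off the abstract refers to.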
Source
《哈尔滨工程大学学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2006, Issue B07, pp. 575-579 (5 pages)
Journal of Harbin Engineering University
Funding
Supported by the National Natural Science Foundation of China (60375019).