Abstract
Translation selection (TS) is the branch of word sense disambiguation (WSD) research that arises in machine translation. This paper introduces the principle of TS based on target-language statistics and, taking the selection of Chinese translations in English-Chinese machine translation (ECMT) as an example, discusses its implementation in detail. The statistical database (SDB) is built by processing the translations in the dictionary of an ECMT system: the translations are segmented and individual Chinese words are selected from them. To meet the requirements of a practical MT system, the large volume of statistical data is compressed and a special retrieval algorithm is used. The paper proposes a polynomial-time, stepwise-infiltration TS algorithm based on co-occurrence frequency, which adopts a greedy strategy and dynamic programming. Experimental results in open tests show that the algorithm raises the precision of translation selection in the ECMT system by more than 10%.
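The idea of choosing translations by target-language co-occurrence with dynamic programming can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details are not given in this abstract. It is a generic Viterbi-style chain search that assumes only adjacent target words are scored and that every source word has at least one candidate. The function names, the log-smoothed score, and the toy counts are all hypothetical.

```python
from math import log


def pair(a, b):
    """Order-independent key for a pair of target-language words."""
    return tuple(sorted((a, b)))


def cooc_score(counts, a, b):
    """Log-smoothed co-occurrence score (an assumed scoring choice)."""
    return log(1 + counts.get(pair(a, b), 0))


def select_translations(candidates, counts):
    """Pick one candidate per position, maximizing the summed
    co-occurrence score of adjacent choices (dynamic programming)."""
    if not candidates:
        return []
    # Each entry: (score of the best path ending in this candidate, that path)
    best = [(0.0, [c]) for c in candidates[0]]
    for options in candidates[1:]:
        new_best = []
        for cur in options:
            score, path = max(
                ((s + cooc_score(counts, p[-1], cur), p) for s, p in best),
                key=lambda t: t[0],
            )
            new_best.append((score, path + [cur]))
        best = new_best
    return max(best, key=lambda t: t[0])[1]


# Toy data (invented for illustration): "bank" -> 河岸 / 银行, "interest" -> 兴趣 / 利率
counts = {pair("银行", "利率"): 50, pair("银行", "兴趣"): 2}
chosen = select_translations([["河岸", "银行"], ["兴趣", "利率"]], counts)
```

With these toy counts, the pair 银行 / 利率 co-occurs most often, so the search selects the "financial" reading of both words. The running time is polynomial: for n source words with at most k candidates each, the loop performs on the order of n·k² score lookups.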
Source
Journal of Basic Science and Engineering (《应用基础与工程科学学报》), 1999, No. 1, pp. 107-116 (10 pages)
Indexed in: EI, CSCD
Funding
Natural Science Foundation
Keywords
translation selection, word sense disambiguation, statistics, target language, machine translation