Abstract
Evaluation plays an important role in machine translation (MT) research: its aim is not only to compare the outputs of different systems, but also to drive the development of key technologies in this area. Translation quality was long evaluated manually; as MT research has advanced, automatic evaluation has become an increasingly important research topic. This paper presents a framework for automatic MT evaluation based on n-gram co-occurrence statistics and describes three methods built on it: BLEU, NIST, and OpenE. The advantages and disadvantages of these methods are analyzed in detail through experiments. Among them, OpenE adopts a new n-gram information-weighting scheme, proposed in this paper, that effectively combines a local corpus (the set of reference translations) with a large global corpus (target-language sentences). The experimental results show that this method is practical for machine translation evaluation.
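To make the n-gram co-occurrence framework concrete, the sketch below implements the core statistic that BLEU builds on: modified (clipped) n-gram precision between a candidate translation and a set of references. This is a minimal illustration of the shared framework, not the paper's OpenE weighting scheme; function names and the toy sentences are our own.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """BLEU-style modified n-gram precision: each candidate n-gram's
    count is clipped by its maximum count in any single reference,
    so repeating a matching word cannot inflate the score."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # Maximum count of each n-gram over all references.
    max_ref = Counter()
    for ref in references:
        for gram, cnt in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], cnt)
    clipped = sum(min(cnt, max_ref[g]) for g, cnt in cand_counts.items())
    return clipped / sum(cand_counts.values())

cand = "the cat is on the mat".split()
refs = ["the cat sat on the mat".split()]
print(modified_precision(cand, refs, 1))  # unigram precision: 5/6
print(modified_precision(cand, refs, 2))  # bigram precision: 3/5
```

Full BLEU then combines these precisions over several n-gram orders (geometrically averaged) with a brevity penalty; NIST and OpenE differ mainly in how each matching n-gram is weighted.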
Source
《中文信息学报》
CSCD
PKU Core Journal
2004, No. 2, pp. 15-22 (8 pages)
Journal of Chinese Information Processing
Funding
National Key Basic Research Program of China (G19980305011)