Abstract
Statistical machine translation has advanced considerably since its inception and is now the mainstream approach to machine translation. However, the translation model, one of its basic modules, grows rapidly as the training corpus grows. To make statistical machine translation more practical, translation model pruning has long been an active research topic. This paper surveys the state of the art in translation model pruning for statistical machine translation. The approaches fall into three categories: statistical analysis of the decoding process, statistical analysis of the parallel training corpus, and analysis of the characteristics of the phrase pairs themselves, such as their internal alignments. Based on this analysis, the paper concludes by discussing future directions for this topic.
Author
郎君
LANG Jun (Institute for Infocomm Research, Singapore 138632)
Source
《智能计算机与应用》
2011, Issue 1X, pp. 13-16 (4 pages)
Intelligent Computer and Applications