A Study of Re-ranking and Combination for Multi-engine Machine Translation
Abstract [Objective/Significance] Machine translation (MT) engines built with different models, methods, languages, and data tend to perform differently across translation scenarios. Many researchers have therefore tried multi-engine output combination or multi-method combination to exploit the strengths of different engines. Previous work, however, has not considered how to use the data generated when users work with multi-engine MT to capture the users' own judgments of the engines' outputs. [Methods/Processes] This study presents a multi-engine translation platform built on six MT engines. Over long-term use the platform has accumulated translation results, user profiles, and human post-edits, and from this large-scale historical data we build a resource warehouse for translation model training. Combining the PageRank algorithm, Bayes' rule, and the UNQE method, we propose a re-ranking method for multi-engine MT output. We then use the re-ranking results, together with translation-instance data from the resource warehouse, to train a translation combination model based on the Transformer architecture. [Limitations] The method has a cold-start problem: it needs real data from a large number of users, collected over a period of time, before it achieves the expected effect. [Results/Conclusions] Experimental results show that the proposed method combines the strengths of multiple engines and improves average translation quality across domains.
Authors LI Ming; ZHANG Keliang; TANG Liang; XIA Rongjing (Information Engineering University (Luoyang), Luoyang 471003, China)
Source Technology Intelligence Engineering (《情报工程》), 2023, No. 2, pp. 96-107 (12 pages)
Keywords Multi-engine machine translation; Translation re-ranking; Translation combination
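
The abstract names PageRank, Bayes' rule, and UNQE as the ingredients of the re-ranking step but does not say how they are combined. The following is only a minimal sketch of one plausible combination, not the authors' method. It assumes a hypothetical preference graph in which `wins[(a, b)]` counts how often users preferred engine `b`'s output over engine `a`'s, uses PageRank over that graph as an engine-level prior, and treats a per-sentence quality score (a plain number standing in for a UNQE-style estimate) as a likelihood. All function names, parameters, and data shapes here are illustrative assumptions.

```python
from typing import Dict, List, Tuple


def pagerank(wins: Dict[Tuple[str, str], int], engines: List[str],
             damping: float = 0.85, iters: int = 100) -> Dict[str, float]:
    """Power-iteration PageRank over a hypothetical user-preference graph.

    wins[(a, b)] is the number of times users preferred engine b over
    engine a, so rank flows from a to b.
    """
    n = len(engines)
    rank = {e: 1.0 / n for e in engines}
    out_total = {e: sum(wins.get((e, o), 0) for o in engines) for e in engines}
    for _ in range(iters):
        # Engines with no outgoing edges spread their rank uniformly.
        dangling = sum(rank[e] for e in engines if out_total[e] == 0)
        new_rank = {}
        for target in engines:
            inflow = sum(
                rank[src] * wins.get((src, target), 0) / out_total[src]
                for src in engines if out_total[src] > 0
            )
            new_rank[target] = (1 - damping) / n + damping * (inflow + dangling / n)
        rank = new_rank
    return rank


def rerank(prior: Dict[str, float],
           qe_score: Dict[str, float]) -> List[Tuple[str, float]]:
    """Rank engines for one sentence by posterior ∝ prior × quality score.

    qe_score holds assumed per-candidate quality estimates in (0, 1];
    the posterior is normalised so the values sum to 1.
    """
    unnorm = {e: prior[e] * qe_score[e] for e in qe_score}
    total = sum(unnorm.values())
    posterior = {e: v / total for e, v in unnorm.items()}
    return sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)


# Illustrative data only: three engines and made-up preference counts.
engines = ["A", "B", "C"]
wins = {("A", "B"): 3, ("C", "B"): 2, ("B", "A"): 1}
prior = pagerank(wins, engines)
ranking = rerank(prior, {"A": 0.9, "B": 0.5, "C": 0.8})
```

In this sketch the prior captures long-run user preference between engines, while the quality score lets a usually weaker engine win on a sentence it handles well. Such a prior would be uninformative until enough user data accumulates, which matches the cold-start limitation the abstract acknowledges.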