Integrating Translation Memory into Neural Machine Translation via Data Augmentation (cited by: 6)
Abstract: Neural machine translation (NMT) is currently the dominant approach in machine translation. Translation memory (TM) is a tool that helps professional translators avoid translating the same content repeatedly: it stores previously completed translation pairs in a translation memory database and retrieves them for reuse in later translation work. This paper proposes two data-augmentation-based methods for integrating translation memory into neural machine translation: (1) directly concatenating the translation memory after the source sentence; (2) concatenating the translation memory by means of tag embeddings. Experiments on Chinese-English and English-German datasets show that the proposed methods yield significant improvements in translation performance.
Authors: CAO Qian (曹骞); XIONG Deyi (熊德意) (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 5, pp. 36-43 (8 pages)
Funding: National Key Research and Development Program of China (2019QY1802)
Keywords: neural machine translation; translation memory; data augmentation
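
The two augmentation schemes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not give the details, so several points are assumptions made for illustration only: that the most similar TM entry is retrieved by a fuzzy-match score (here Python's `difflib.SequenceMatcher`), that the target side of the retrieved pair is what gets appended, that a separator token `<tm>` marks the boundary in method (1), and that method (2) assigns each token a tag id (0 for source, 1 for TM) whose embedding a model could combine with the word embedding. The names `retrieve`, `augment_concat`, `augment_with_tags`, and the `threshold` parameter are hypothetical.

```python
from difflib import SequenceMatcher

SEP = "<tm>"  # hypothetical separator token, not taken from the paper


def fuzzy_score(a, b):
    """Similarity in [0, 1] between two token lists (assumed match metric)."""
    return SequenceMatcher(None, a, b).ratio()


def retrieve(source_tokens, memory, threshold=0.5):
    """Return the target side of the most similar TM entry, or None.

    `memory` is a list of (tm_source_tokens, tm_target_tokens) pairs.
    The threshold value is an illustrative assumption.
    """
    best_score, best_target = 0.0, None
    for tm_source, tm_target in memory:
        score = fuzzy_score(source_tokens, tm_source)
        if score > best_score:
            best_score, best_target = score, tm_target
    return best_target if best_score >= threshold else None


def augment_concat(source_tokens, tm_target):
    """Method (1) sketch: append the retrieved TM after the source sentence."""
    if tm_target is None:
        return list(source_tokens)
    return list(source_tokens) + [SEP] + list(tm_target)


def augment_with_tags(source_tokens, tm_target):
    """Method (2) sketch: concatenate without a separator and emit tag ids.

    Tag 0 marks source tokens and tag 1 marks TM tokens; a model would look
    up an embedding per tag id (that part is not shown here).
    """
    tm_tokens = list(tm_target or [])
    tokens = list(source_tokens) + tm_tokens
    tags = [0] * len(source_tokens) + [1] * len(tm_tokens)
    return tokens, tags
```

A call such as `augment_concat(src, retrieve(src, memory))` would produce the augmented input for method (1), while `augment_with_tags` returns the token sequence together with the parallel tag ids for method (2). When no TM entry reaches the threshold, both functions fall back to the plain source sentence.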