
Masking mechanism based non-autoregressive neural machine translation (Cited by: 3)
Abstract: Neural machine translation models based on the self-attention mechanism have made great progress, but autoregressive models cannot decode in parallel, which makes decoding slow. We propose a non-autoregressive neural machine translation model that decodes in parallel and is trained using only the encoder module of a Transformer, simplifying the traditional encoder-decoder structure. A masking mechanism is introduced during training to narrow the quality gap with autoregressive translation. Compared with other non-autoregressive translation models, the proposed model achieves better results on the WMT 2016 Romanian-English translation task, and, when initialized with a cross-lingual pre-trained language model, achieves results comparable to an autoregressive neural machine translation model.
Authors: JIA Hao; WANG Xu; JI Baijun; DUAN Xiangyu; ZHANG Min (School of Computer Science and Technology, Soochow University, Suzhou 215006, China)
Source: Journal of Xiamen University (Natural Science), 2021, No. 4, pp. 648-654 (7 pages). Indexed in CAS, CSCD, and the Peking University core journal list.
Funding: National Natural Science Foundation of China (61673289).
Keywords: neural machine translation; masking mechanism; non-autoregressive
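The abstract above describes training a single Transformer encoder on sequences in which target tokens are randomly masked, so that at decoding time all target positions can be predicted in parallel. The sketch below is a minimal illustration of that idea in PyTorch, assuming a conditional-masked-language-model style setup in which the source sentence and a partially masked target are concatenated into one input sequence; the class name EncoderOnlyNAT, the masking ratio, and all hyperparameters are illustrative assumptions, not the authors' actual implementation.

```python
# Minimal sketch of mask-based training for non-autoregressive translation
# with a single Transformer encoder. All names, sizes, and the masking ratio
# are illustrative assumptions; positional encodings are omitted for brevity.
import torch
import torch.nn as nn

PAD, MASK = 0, 1                      # assumed special-token ids
VOCAB, D_MODEL = 32000, 512           # illustrative sizes

class EncoderOnlyNAT(nn.Module):
    """Single Transformer encoder over [source ; masked target]."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL, padding_idx=PAD)
        layer = nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=8,
                                           dim_feedforward=2048,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=6)
        self.proj = nn.Linear(D_MODEL, VOCAB)

    def forward(self, tokens, pad_mask):
        # tokens: (batch, src_len + tgt_len); every position is encoded and
        # predicted in one pass, with no left-to-right dependency.
        h = self.encoder(self.embed(tokens), src_key_padding_mask=pad_mask)
        return self.proj(h)

def masked_training_step(model, src, tgt, mask_ratio=0.5):
    """Randomly replace target tokens with MASK and train the model to
    recover them; the loss is computed on the masked positions only."""
    masked_pos = (torch.rand(tgt.shape, device=tgt.device) < mask_ratio) & tgt.ne(PAD)
    corrupted = tgt.masked_fill(masked_pos, MASK)
    inp = torch.cat([src, corrupted], dim=1)
    logits = model(inp, inp.eq(PAD))[:, src.size(1):]   # keep target positions
    return nn.functional.cross_entropy(logits[masked_pos], tgt[masked_pos])
```

At inference, one would feed the source followed by an all-MASK target of a predicted length and fill every position in a single parallel forward pass (possibly with iterative refinement), which is what removes the autoregressive decoding bottleneck described in the abstract.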

