Abstract
Neural machine translation models based on the self-attention mechanism have made great progress, but autoregressive neural machine translation cannot perform parallel computation during decoding and is therefore time-consuming. We propose a non-autoregressive neural machine translation model that enables parallel decoding and is trained with only a single Transformer encoder module, simplifying the traditional encoder-decoder structure. A masking mechanism is introduced during training to narrow the quality gap with autoregressive neural machine translation. Compared with other non-autoregressive translation models, the proposed model achieves better results on the WMT 2016 Romanian-English translation task, and reaches performance comparable to autoregressive neural machine translation models when initialized with a cross-lingual pre-trained language model.
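The abstract only outlines the training scheme at a high level. As a rough illustration, the sketch below shows one way a single Transformer encoder can be trained to predict masked target tokens in parallel. It is a minimal sketch assuming a PyTorch implementation; the class name SingleEncoderNAT, the helper mask_targets, and all hyperparameters are illustrative assumptions and are not taken from the paper.

# Minimal sketch (assumed PyTorch implementation, not the authors' code):
# a single Transformer encoder reads the source sentence concatenated with a
# partially masked target sentence and predicts all masked target positions
# in parallel, i.e. non-autoregressively. Positional/language embeddings and
# target-length prediction are omitted for brevity.
import torch
import torch.nn as nn

class SingleEncoderNAT(nn.Module):  # illustrative name
    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=6,
                 pad_id=0, mask_id=1):
        super().__init__()
        self.pad_id, self.mask_id = pad_id, mask_id
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=pad_id)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src, tgt_masked):
        # One shared encoder sees source and (masked) target as a single
        # sequence, replacing the usual encoder-decoder structure.
        x = torch.cat([src, tgt_masked], dim=1)
        h = self.encoder(self.embed(x), src_key_padding_mask=x.eq(self.pad_id))
        # Score only the target half; every position is predicted in parallel.
        return self.out(h[:, src.size(1):])

def mask_targets(tgt, mask_id=1, prob=0.15, pad_id=0):
    # Randomly replace target tokens with [MASK]; only these positions
    # would be supervised by the cross-entropy loss.
    picked = (torch.rand(tgt.shape, device=tgt.device) < prob) & tgt.ne(pad_id)
    return tgt.masked_fill(picked, mask_id), picked

During training, the loss would be computed only on the positions selected by mask_targets; at inference, the target side can be filled entirely with [MASK] tokens of a predicted length and decoded in one or a few parallel passes.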
Authors
贾浩
王煦
季佰军
段湘煜
张民
JIA Hao; WANG Xu; JI Baijun; DUAN Xiangyu; ZHANG Min (School of Computer Science and Technology, Soochow University, Suzhou 215006, China)
Source
《厦门大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals (北大核心)
2021, No. 4, pp. 648-654 (7 pages)
Journal of Xiamen University (Natural Science)
Funding
National Natural Science Foundation of China (61673289).
Keywords
neural machine translation
masking mechanism
non-autoregressive