Abstract
Neural machine translation (NMT) has achieved significant improvements on many translation tasks, but it relies on large-scale parallel corpora, and high-quality parallel corpora are often difficult to obtain. To address this problem, a multilingual unsupervised NMT model is proposed, which jointly trains multiple languages using a shared encoder and multiple decoders. Experimental results show that the multilingual unsupervised NMT model outperforms the bilingual unsupervised NMT baseline, improving the BLEU score by up to 1.48% on the WMT test sets, and that the model can also translate between language pairs unseen during training.
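The architecture named in the abstract pairs one shared encoder with a separate decoder per language, so any encoded source can be routed to any target decoder. Below is a minimal sketch of that idea, assuming PyTorch; the module names, layer sizes, language set, and shared vocabulary are illustrative assumptions, not details from the paper.

```python
# Sketch of a shared-encoder / multi-decoder NMT model (PyTorch assumed).
# All names, sizes, and the language set are illustrative, not from the paper.
import torch
import torch.nn as nn

class MultilingualUNMT(nn.Module):
    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=3,
                 languages=("en", "fr", "de")):
        super().__init__()
        # Shared embedding table stands in for the cross-lingual word embeddings.
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.shared_encoder = nn.TransformerEncoder(enc_layer, num_layers)
        # One decoder per target language, all reading the shared encoder output.
        self.decoders = nn.ModuleDict({
            lang: nn.TransformerDecoder(
                nn.TransformerDecoderLayer(d_model, nhead, batch_first=True),
                num_layers)
            for lang in languages
        })
        self.out_proj = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids, tgt_lang):
        # Language-agnostic encoding of the source sentence.
        memory = self.shared_encoder(self.embed(src_ids))
        # Route to the requested language's decoder (causal mask omitted for brevity).
        hidden = self.decoders[tgt_lang](self.embed(tgt_ids), memory)
        return self.out_proj(hidden)  # logits over the shared vocabulary

# Usage: the same encoded source can feed any language's decoder, which is
# what makes translation between training-unseen pairs possible.
model = MultilingualUNMT(vocab_size=32000)
src = torch.randint(0, 32000, (2, 10))   # batch of source token ids
tgt = torch.randint(0, 32000, (2, 8))    # shifted target token ids
logits = model(src, tgt, tgt_lang="de")  # shape (2, 8, 32000)
```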
Author
WEN Liying (Wuhan Research Institute of Posts and Telecommunications, Wuhan 430000, China; Nanjing Fiberhome World Communication Technology Co., Ltd., Nanjing 210000, China)
Source
Electronic Design Engineering (《电子设计工程》), 2021, No. 20, pp. 48-51, 56 (5 pages)
Keywords
neural machine translation
unsupervised
multilingual
cross-lingual word embedding