Abstract
Neural machine translation has achieved good results on many tasks; however, its performance on low-resource languages remains unsatisfactory. To address this problem, this paper proposes a Transformer-based Uyghur-Chinese machine translation model. The model introduces a recurrence mechanism and time encoding on top of the Transformer, giving the model better generalization and computational efficiency. Beam search optimization is applied at the model's output, and character-granularity language-model perplexity is used to judge problems such as homophones and confusable characters in the output. Experiments use BLEU as the evaluation metric and show that, on the basis of only a small Uyghur-Chinese parallel corpus, the improved Transformer-based Uyghur-Chinese machine translation model proposed in this paper achieves better results, raising the BLEU score by 0.93%.
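The character-granularity perplexity criterion described above can be illustrated with a minimal sketch: a character-level bigram language model with add-alpha smoothing scores several beam-search candidates, and the candidate with the lowest perplexity is kept. This is an assumption-laden toy illustration, not the paper's actual model; the corpus, function names, and smoothing choice are all hypothetical.

```python
import math
from collections import Counter

def train_char_lm(corpus):
    """Count character unigrams and bigrams over a toy training corpus.
    (Hypothetical helper; the paper's LM and training data are not specified here.)"""
    bigrams, unigrams, vocab = Counter(), Counter(), set()
    for sent in corpus:
        chars = ["<s>"] + list(sent)  # sentence-start marker
        vocab.update(chars)
        for prev, cur in zip(chars, chars[1:]):
            bigrams[(prev, cur)] += 1
            unigrams[prev] += 1
    return bigrams, unigrams, len(vocab)

def char_perplexity(sentence, bigrams, unigrams, vocab_size, alpha=1.0):
    """Character-level bigram perplexity with add-alpha smoothing.
    Lower perplexity = the character sequence looks more like the training text."""
    chars = ["<s>"] + list(sentence)
    log_prob = 0.0
    for prev, cur in zip(chars, chars[1:]):
        num = bigrams[(prev, cur)] + alpha
        den = unigrams[prev] + alpha * vocab_size
        log_prob += math.log(num / den)
    return math.exp(-log_prob / (len(chars) - 1))

# Toy corpus and two hypothetical beam-search candidates: one fluent,
# one containing confused/garbled characters.
corpus = ["the cat sat", "the cat ran"]
bi, uni, V = train_char_lm(corpus)
candidates = ["the cat sat", "thx cqt szt"]
best = min(candidates, key=lambda s: char_perplexity(s, bi, uni, V))
print(best)
```

In this sketch the garbled candidate contains character bigrams never seen in training, so smoothing assigns them low probability and its perplexity is higher; the fluent candidate is selected, mirroring how a character-level LM can filter homophone and confusable-character errors from beam-search output.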
Author
DU Zhihao (Wuhan Research Institute of Posts and Telecommunication, Wuhan 430070, China; Nanjing Fiberhome Tiandi Communication Technology Co., Ltd., Nanjing 210019, China)
Source
Electronic Design Engineering, 2023, Issue 22, pp. 47-51 (5 pages)