Abstract
Chinese-Vietnamese neural machine translation is a typical low-resource task. Because large-scale parallel corpora are lacking, the model may fail to learn the syntactic differences between the two languages adequately, which degrades translation quality. Syntactic dependency relations can guide and constrain the generation of the target text, so this paper proposes a Chinese-Vietnamese neural machine translation method based on a dependency graph network. The method builds a dependency graph network from dependency syntactic relations and incorporates it into the neural machine translation model. Within the Transformer framework, a graph encoder is introduced to encode the dependency structure graph of the source language into vectors, and a multi-head attention mechanism integrates this graph encoding into the sequence encoding. During decoding, the structural encoding and the sequence encoding jointly guide the decoder in generating the translation. Experimental results show that, on the Chinese-Vietnamese translation task, incorporating the dependency syntax graph improves the performance of the translation model.
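The pipeline summarized in the abstract (dependency graph, graph encoding, attention-based fusion with the sequence encoding) can be illustrated with a toy sketch. This is not the authors' implementation: the function names `build_adjacency`, `graph_encode`, and `fuse` are hypothetical, the graph encoder is reduced to one round of mean aggregation over neighbours, the fusion uses a single attention head with no learned parameters, and the dependency heads and embeddings are made-up values.

```python
import math

def build_adjacency(heads):
    """Build a dependency graph from head indices.
    heads[i] is the 0-based index of token i's head, or -1 for the root.
    Assumption: arcs are treated as undirected and every node has a self-loop."""
    n = len(heads)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child][head] = 1.0
            adj[head][child] = 1.0
    return adj

def graph_encode(adj, feats):
    """Toy graph encoder: each node becomes the mean of its neighbours' features."""
    dim = len(feats[0])
    out = []
    for row in adj:
        degree = sum(row)
        out.append([sum(w * feats[j][d] for j, w in enumerate(row)) / degree
                    for d in range(dim)])
    return out

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(seq, graph):
    """Single-head scaled dot-product attention (a simplification of multi-head):
    sequence states are queries, graph states are keys and values.
    The attended graph context is added to each sequence state (residual)."""
    dim = len(seq[0])
    fused = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in graph]
        weights = softmax(scores)
        ctx = [sum(w * g[d] for w, g in zip(weights, graph)) for d in range(dim)]
        fused.append([x + c for x, c in zip(q, ctx)])
    return fused

# Hypothetical three-token source sentence: tokens 0 and 2 depend on token 1 (the root).
heads = [1, -1, 1]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up 2-d token vectors

adj = build_adjacency(heads)
graph_states = graph_encode(adj, embeddings)
fused_states = fuse(embeddings, graph_states)
```

In a real system the aggregation and attention steps would carry learned projection matrices and several heads, and `fused_states` would be passed to the Transformer decoder together with the ordinary sequence encoding.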
Authors
PU Liuqing (普浏清)
YU Zhengtao (余正涛)
WEN Yonghua (文永华)
GAO Shengxiang (高盛祥)
LIU Yiyang (刘奕洋)
Affiliations: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
Peking University Core Journals (北大核心)
2021, No. 12, pp. 68-75 (8 pages)
Funding
National Natural Science Foundation of China (61732005, 61761026, 61672271)
National Key Research and Development Program of China (2019QY1802, 2019QY1801)
Keywords
Low-resource
Dependency syntax
Dependency graph