Abstract
Mongolian-Chinese machine translation in Inner Mongolia remains underdeveloped, and the available parallel bilingual corpus is small. As a result, traditional statistical machine translation methods suffer from data sparsity and overfitting during training, which lowers translation quality. To address this, we propose a Mongolian-Chinese neural machine translation method based on LSTM. It uses the long short-term memory (LSTM) model to build an end-to-end neural network framework that models the Mongolian-Chinese machine translation system. To capture Mongolian semantic information more effectively, Mongolian words are segmented into morphemes according to the characteristics of the language, and the morphemes are fed into the model. In addition, a local attention mechanism is introduced to compute the weights of the source morphemes associated with each target word, which yields alignment probabilities between Mongolian and Chinese vocabulary and improves translation quality. Experimental results show that the proposed method achieves better translation quality than the traditional Mongolian-Chinese translation system.
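The local attention step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which local attention variant the authors use, so this assumes one common formulation: dot-product scores are computed only inside a window around an aligned source position, normalized with a softmax, and then scaled by a Gaussian centered on that position. The function names, the `center` and `half_width` parameters, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def local_attention_weights(decoder_state, encoder_states, center, half_width):
    """Weights over source morphemes, non-zero only inside a local window.

    decoder_state: current target-side hidden state (list of floats).
    encoder_states: one hidden state per source morpheme (list of lists).
    center: assumed aligned source position for the current target word.
    half_width: window half-width, must be >= 1 (hypothetical hyperparameter).
    """
    n = len(encoder_states)
    lo = max(0, int(center) - half_width)
    hi = min(n, int(center) + half_width + 1)

    # Dot-product alignment scores, computed inside the window only.
    scores = [
        sum(d * e for d, e in zip(decoder_state, encoder_states[i]))
        for i in range(lo, hi)
    ]

    # Numerically stable softmax over the window.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)

    # Gaussian factor favours positions close to the assumed center.
    sigma = half_width / 2.0
    weights = [0.0] * n
    for k, i in enumerate(range(lo, hi)):
        gauss = math.exp(-((i - center) ** 2) / (2 * sigma ** 2))
        weights[i] = exps[k] / total * gauss
    return weights


def context_vector(weights, encoder_states):
    """Weighted sum of encoder states, used when predicting the target word."""
    dim = len(encoder_states[0])
    return [
        sum(w * h[j] for w, h in zip(weights, encoder_states))
        for j in range(dim)
    ]


# Toy example: five source morphemes with 2-dimensional hidden states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
decoder_state = [1.0, 1.0]
weights = local_attention_weights(decoder_state, encoder_states, center=2.0, half_width=1)
context = context_vector(weights, encoder_states)
```

Positions outside the window receive zero weight, so each target word attends to only a few nearby source morphemes rather than the whole sentence. In this formulation the weights are not renormalized after the Gaussian scaling, so they need not sum to one.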
Authors
刘婉婉
苏依拉
乌尼尔
仁庆道尔吉
LIU Wan-wan; SU Yi-la; WU Ni-er; RENQING Dao-er-ji (College of Information Engineering, Inner Mongolia University of Technology, Hohhot 010080, China)
Source
《计算机工程与科学》
CSCD
Peking University Core Journal (北大核心)
2018, No. 10, pp. 1890-1896 (7 pages)
Computer Engineering & Science
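Funding
National Natural Science Foundation of China (61363052, 61502255)
Natural Science Foundation of Inner Mongolia Autonomous Region (2016MS0605)
Inner Mongolia Ethnic Affairs Commission Fund (MW-2017-MGYWXXH-03)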
Keywords
attention
end-to-end model
machine translation
Mongolian-Chinese
LSTM neural network