Research on Tibetan-Chinese Neural Machine Translation Based on Data Enhancement (cited by: 3)
Abstract Tibetan-Chinese machine translation helps strengthen national unity, advances Tibetan information technology, and breaks down barriers between languages. Tibetan-Chinese neural machine translation has achieved notable improvements on many translation tasks, but it depends on large-scale parallel corpora, and such corpora are scarce for low-resource languages. This paper studies two data augmentation strategies, synonym replacement and back translation, to offer a research approach for Tibetan-Chinese machine translation under low-resource conditions and thereby support the development of Tibetan society. In testing, Tibetan-Chinese machine translation improved by an average of 4.59 BLEU points.
Authors YANG Dan; SUN Yidong; YONG Cuo (School of Information Science and Technology, Tibet University, Lhasa 850000; State Key Laboratory of Artificial Intelligence for Tibetan Information Technology in Tibet Autonomous Region, Lhasa 850000; Ministry of Education Engineering Research Center for Tibetan Information Technology, Lhasa 850000)
Source Computer & Digital Engineering, 2022, No. 11, pp. 2473-2477 (5 pages)
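Funding Supported by the National Key R&D Program of China project "Integration and Application Demonstration of Digitization Technology for Tibetan Document Resources" (No. 2017YFB1402200) and the Tibet Autonomous Region Science and Technology Innovation Base Independent Research Project (No. XZ2021JR002G).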
Keywords Tibetan-Chinese neural machine translation; data augmentation; synonym replacement; back translation
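The abstract names two augmentation strategies, synonym replacement and back translation, without giving implementation details. The following is a minimal sketch of how such a pipeline might look, not the authors' method. Everything in it is an assumption for illustration: the function names, the toy English synonym table, whitespace tokenization (a stand-in for proper Tibetan or Chinese word segmentation), the choice to apply replacement to the target side only, and the `target_to_source` callable, which stands in for a trained reverse translation model.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def synonym_replace(
    tokens: List[str],
    synonyms: Dict[str, List[str]],
    p: float = 0.3,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Replace each token that has synonyms with probability p."""
    if rng is None:
        rng = random.Random(0)
    out = []
    for tok in tokens:
        candidates = synonyms.get(tok)
        if candidates and rng.random() < p:
            out.append(rng.choice(candidates))
        else:
            out.append(tok)
    return out


def back_translate(
    target_sentences: List[str],
    target_to_source: Callable[[str], str],
) -> List[Pair]:
    """Build synthetic (source, target) pairs from monolingual target text.

    target_to_source is a hypothetical placeholder for a reverse
    (target-to-source) translation model.
    """
    return [(target_to_source(t), t) for t in target_sentences]


def augment(
    pairs: List[Pair],
    synonyms: Dict[str, List[str]],
    mono_target: List[str],
    target_to_source: Callable[[str], str],
    p: float = 0.3,
    seed: int = 0,
) -> List[Pair]:
    """Return the original pairs plus synonym-replaced and back-translated pairs."""
    rng = random.Random(seed)
    augmented = list(pairs)
    for src, tgt in pairs:
        # Assumption: replacement is applied to the target side only.
        new_tgt = " ".join(synonym_replace(tgt.split(), synonyms, p, rng))
        if new_tgt != tgt:  # keep only sentences that actually changed
            augmented.append((src, new_tgt))
    augmented.extend(back_translate(mono_target, target_to_source))
    return augmented
```

In a real setting the synonym table would come from a Tibetan or Chinese lexicon, and the synthetic pairs would be mixed with the original parallel corpus before training; how the paper weights or filters them is not stated in the abstract.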