Integrating Glyph Features of Chinese Character into Chinese-English Neural Machine Translation Model (融入汉字字形特征的中英神经机器翻译模型) Cited by: 7
Abstract Neural machine translation (NMT) currently achieves the best results among the machine translation methods in practical use. Introducing external linguistic knowledge, such as part-of-speech tags and dependency syntax labels, into NMT systems has been shown by many researchers to be an effective way to improve translation performance. Unlike phonetic writing systems, Chinese characters are semantic-phonetic compounds, built with one part indicating pronunciation and the other indicating meaning. This structure gives Chinese characters rich semantic, phonetic, and syntactic information. Building on the work of Marta R et al., this paper proposes a new method for incorporating glyph features into an end-to-end model and applies it to Chinese-to-English translation. Compared with the baseline system, the method achieves a significant average improvement of 1.1 points on the NIST evaluation sets, demonstrating that the glyph features of Chinese characters can benefit neural machine translation models.
Authors CAI Zilong (蔡子龙); XIONG Deyi (熊德意) (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
Source Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2019, Issue 5, pp. 75-81 (7 pages)
Funding National Natural Science Foundation of China (61622209, 61861130364)
Keywords neural machine translation; Chinese character glyph features; end-to-end model