Domain Adaptation of Reordering Model via Topic Information: Word Order in Translated Text across Domains

Original title: 译文语序的领域性思考:一种融合主题信息的领域自适应调序模型
Abstract: Research on domain adaptation (DA) for statistical machine translation (SMT) aims to build translation models that adjust dynamically, so that the model can learn and handle the linguistic characteristics of the target domain and the translation system performs in a balanced, reliable way across domains. Existing work on adapting the translation model has made remarkable progress, but the domain adaptation of the reordering process has received comparatively little attention. In earlier work, statistics over large-scale real translation samples in the source and target languages showed that, among semantically equivalent phrase-level translation pairs, 36.17% of the samples exhibit significant word-order differences across domains. To address this, the paper takes a topic perspective, explores how phrase reordering differs under different topic distributions, and proposes a domain-adaptive reordering model that incorporates topic information. Experimental results show that a translation system with the adaptive reordering model embedded achieves a clear performance advantage.
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the PKU Core Journals list, 2017, No. 5, pp. 50-58 (9 pages)
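Funding: National Natural Science Foundation of China (61373097, 61672368, 61672367, 61331011); Jiangsu Provincial Science and Technology Program (SBK2015022101); Ministry of Education-China Mobile Research Fund (MCM20150602)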
Keywords: statistical machine translation; domain adaptation; reordering model; topic model
