
System Combination with Multiple Granularities for Morphologically Rich Language Translation (cited by: 3)
Abstract  Morphologically rich languages undergo complex morphological changes, which lead to large vocabularies and data sparseness and pose a major challenge for statistical machine translation. This paper represents such a language at several different granularities and translates each representation separately. Because different granularities capture characteristics of the language at different levels, combining the resulting translation hypotheses through word-level system combination produces better translations. Experiments on Uyghur-to-Chinese and Mongolian-to-Chinese translation show that this multi-granularity system combination improves translation quality: BLEU rises by +1.41% and +2.03%, respectively, over the best single system.
Source  Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 4, pp. 75-81 (7 pages)
Funding  Key Program of the National Natural Science Foundation of China (60736014); National Natural Science Foundation of China (60873167)
Keywords  morphologically rich language; multiple granularities; system combination
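
The abstract names word-level system combination but does not describe the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes three hypothetical systems (word, stem, and morpheme granularity) that each output a tokenized translation, aligns every hypothesis to a chosen backbone with Python's `difflib.SequenceMatcher`, and takes an unweighted majority vote per backbone position. The function names, the example sentences, and the voting rule are all illustrative assumptions; practical combination systems typically use more careful alignment and weighted scoring.

```python
from collections import Counter
from difflib import SequenceMatcher

EPS = ""  # placeholder meaning "this system proposes no word for this slot"


def align_to_backbone(backbone, hyp):
    """Return, for each backbone position, the word of `hyp` aligned to it (or EPS)."""
    aligned = [EPS] * len(backbone)
    matcher = SequenceMatcher(a=backbone, b=hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("equal", "replace"):
            # Pair words position by position; surplus words on either side are dropped.
            for k in range(min(i2 - i1, j2 - j1)):
                aligned[i1 + k] = hyp[j1 + k]
        # "delete": backbone words without a counterpart keep EPS.
        # "insert": hypothesis words without a backbone slot are ignored in this sketch.
    return aligned


def combine(hypotheses, backbone_index=0):
    """Majority vote per backbone slot; ties go to the earliest system (backbone first)."""
    backbone = hypotheses[backbone_index]
    columns = [[word] for word in backbone]
    for idx, hyp in enumerate(hypotheses):
        if idx == backbone_index:
            continue
        for column, word in zip(columns, align_to_backbone(backbone, hyp)):
            column.append(word)

    output = []
    for column in columns:
        counts = Counter(column)
        best = max(counts.values())
        winner = next(word for word in column if counts[word] == best)
        if winner != EPS:
            output.append(winner)
    return output


# Hypothetical outputs of three systems trained on different granularities.
word_level = ["he", "likes", "red", "apples"]
stem_level = ["he", "likes", "green", "apples"]
morpheme_level = ["he", "like", "green", "apples"]

print(combine([word_level, stem_level, morpheme_level]))
```

In this toy example each system is wrong in a different position, so the vote can recover a sentence that no single system produced in full, which is the intuition behind combining granularities.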

References (18)

  • 1. 那顺乌日图, 刘群, 巴达玛敖德斯尔. Mongolian Generation for Machine Translation[C]//Proceedings of the 6th National Joint Conference on Computational Linguistics. Beijing, 2001: 285-291.
  • 2. Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. BLEU: a Method for Automatic Evaluation of Machine Translation[C]//Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002: 311-318.
  • 3. Young-Suk Lee. Morphological Analysis for Statistical Machine Translation[C]//Proceedings of HLT-NAACL 2004, 2004: 57-60.
  • 4. Sonja Nießen and Hermann Ney. Statistical Machine Translation with Scarce Resources Using Morpho-syntactic Information[J]. Computational Linguistics, 2004, 30: 181-204.
  • 5. Michael Collins, Philipp Koehn, and Ivona Kučerová. Clause Restructuring for Statistical Machine Translation[C]//Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, 2005: 531-540.
  • 6. Marine Carpuat, Yuval Marton, and Nizar Habash. Improving Arabic-to-English Statistical Machine Translation by Reordering Post-verbal Subjects for Alignment[C]//Proceedings of the ACL 2010 Conference Short Papers, 2010: 178-183.
  • 7. Dmitriy Genzel. Automatically Learning Source-side Reordering Rules for Large Scale Machine Translation[C]//Proceedings of the 23rd International Conference on Computational Linguistics, 2010: 376-384.
  • 8. Peng Xu, Jaeho Kang, Michael Ringgaard, and Franz Josef Och. Using a Dependency Parser to Improve SMT for Subject-Object-Verb Languages[C]//Proceedings of the 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2009: 245-253.
  • 9. Philipp Koehn and Hieu Hoang. Factored Translation Models[C]//Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2007: 868-876.
  • 10. C. Dyer, S. Muresan, and P. Resnik. Generalizing Word Lattice Translation[C]//Proceedings of ACL-08: HLT, 2008: 1012-1020.

Co-citing documents (1)

Co-cited documents (27)

  • 1. KOEHN P, OCH F J, MARCU D. Statistical phrase-based translation[C]//Proc of Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology. 2003: 48-54.
  • 2. KOEHN P. Pharaoh: a beam search decoder for phrase-based statistical machine translation models[C]//Proc of Machine Translation: From Real Users to Research. 2004: 115-124.
  • 3. ZENS R, NEY H, WATANABE T, et al. Reordering constraints for phrase-based statistical machine translation[C]//Proc of the 20th International Conference on Computational Linguistics. 2004.
  • 4. XIONG De-yi, LIU Qun, LIN Shou-xun. Maximum entropy based phrase reordering model for statistical machine translation[C]//Proc of COLING-ACL. 2006.
  • 5. SCHROEDER J, KOEHN P. The University of Edinburgh system description for IWSLT 2007[C]//Proc of International Workshop on Spoken Language Translation. 2007.
  • 6. HE Zhong-jun, MENG Yao, YU Hao. Maximum entropy based phrase reordering for hierarchical phrase-based translation[C]//Proc of Conference on Empirical Methods in Natural Language Processing. 2010.
  • 7. KNIGHT K, YAMADA K. A computational approach to deciphering unknown scripts[C]//Proc of the ACL Workshop on Unsupervised Learning in Natural Language Processing. 1999: 31-36.
  • 8. ZHANG Le. Maximum entropy modeling toolkit for Python and C++[EB/OL]. (2004). http://www.nlplab.cn/zhangle/maxent_toolkit.html.
  • 9. KOEHN P, HOANG H, BIRCH A. Moses: open source toolkit for statistical machine translation[C]//Proc of Annual Meeting of the Association for Computational Linguistics (ACL). 2007.
  • 10. STOLCKE A, ZHENG Jing, WANG Wen, et al. SRILM at sixteen: update and outlook[C]//Proc of IEEE Workshop on Speech Recognition and Understanding. 2011.

Citing documents (3)

Second-level citing documents (6)
