
Parse-realize based paraphrasing and SMT corpus enriching
(Original Chinese title: 基于分析和生成的复述与SMT语料扩展)

Cited by: 3
Abstract: To address the insufficient coverage of reordering phenomena in statistical machine translation (SMT) training corpora, the corpus is expanded by paraphrasing. A paraphrasing method based on dependency parsing and sentence realization is proposed. The input sentence is first parsed into a dependency tree, and multiple natural-language sentences are then realized from that tree. The generated sentences contain the same words as the original sentence but may differ from it in word order. Experiments show that, without introducing any extra resources, the method effectively alleviates the low-coverage problem of the training corpus and improves translation quality.
Authors: HE Wei (和为), LIU Ting (刘挺)
Source: Journal of Harbin Institute of Technology (《哈尔滨工业大学学报》), 2013, No. 5, pp. 45-50 (6 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.
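Funding: National Natural Science Foundation of China, General Program (61073126, 61133012); National High Technology Research and Development Program of China, Major Project (2011AA01A207)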
Keywords: paraphrase; statistical machine translation; dependency parsing; sentence realization