

Three Ways to Incorporate Bilingual Phrases into Dependency-to-String Model
Abstract The dependency-to-string model uses translation rules based on head-dependents relation (HDR) fragments, each a tree fragment consisting of a head and all its dependents. These rules capture compositional phenomena such as sentence patterns and phrase patterns in the source language well, but they fall short on non-compositional phenomena such as idioms and fixed collocations, which phrases capture easily. To improve the model's performance, this paper proposes three ways to incorporate bilingual phrases: syntactic phrases, generalized syntactic phrases, and non-syntactic phrases. Experiments show that using all three kinds of phrases together significantly improves the dependency-to-string model, by about 1.0 BLEU point.
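Authors Jun Xie (谢军), Qun Liu (刘群)
Source Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2014, No. 2, pp. 44-50 (7 pages)
Funding Key Program of the National Natural Science Foundation of China (60736014); National Natural Science Foundation of China (60873167, 90920004); National High-Tech R&D Program of China (863 Program) key project (2011AA01A207)
Keywords statistical machine translation; dependency-to-string model; generalized syntactic phrases; non-syntactic phrases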
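The abstract defines an HDR fragment as a head together with its dependents. The following is a minimal Python sketch of that definition, not the authors' implementation. It assumes a dependency tree given as a list of head indices, and it assumes that "dependents" means direct dependents only. The function name `extract_hdr_fragments` and the example sentence are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def extract_hdr_fragments(words, heads):
    """Return one (head, [dependents]) pair per word that has dependents.

    words: list of tokens.
    heads: heads[i] is the index of the head of words[i], or -1 for the root.
    Assumption: only direct dependents are collected for each head.
    """
    children = defaultdict(list)
    for idx, head in enumerate(heads):
        if head >= 0:
            children[head].append(idx)
    # Fragments are ordered by head position; dependents keep sentence order.
    return [
        (words[h], [words[d] for d in deps])
        for h, deps in sorted(children.items())
    ]


# Hypothetical example: "kicked" heads "She" and "bucket"; "bucket" heads "the".
words = ["She", "kicked", "the", "bucket"]
heads = [1, -1, 3, 1]
fragments = extract_hdr_fragments(words, heads)
for head, dependents in fragments:
    print(head, dependents)
```

In this toy sentence the idiom "kicked the bucket" is split across two fragments, one headed by "kicked" and one headed by "bucket". This illustrates the kind of non-compositional expression that the abstract says phrases capture more easily than HDR-based rules.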







