
A Hybrid System for Chinese-English Patent Machine Translation
Abstract Machine translation for the patent domain has become an important application area of machine translation in recent years. This paper presents a hybrid system for translating Chinese patent texts into English. The system is built around a rule-based machine translation (RBMT) engine and combines it with a phrase-based statistical machine translation (SMT) system. Within the hybrid system, the RBMT engine handles source-language analysis and the transfer stage: it produces the source-language parse tree and the transfer tree, and fixes the basic syntactic frame of the target language. In the target-language generation stage, the SMT subsystem selects suitable target word forms according to the generated target-language syntactic structure and produces the final candidate translations. In tests with automatic evaluation metrics, the hybrid system outperformed both the standalone rule-based system and the standalone statistical system on every metric, which indicates that the hybrid approach is effective and feasible and can improve translation quality.
Authors LI HongZheng, ZHAO Kai, HU RenFen, JIANG HongFei, ZHU Yun, JIN YaoHong (Institute of Chinese Information Processing, Beijing Normal University, Beijing 100875, China; Beijing Qihu Technology Co., Ltd., Beijing 100015, China; Beijing Dinfo Technology Co., Ltd., Beijing 100107, China)
Source Technology Intelligence Engineering (《情报工程》), 2017, No. 3, pp. 105-115 (11 pages)
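Funding Supported by the National High Technology Research and Development Program of China, project "Multi-level Knowledge Representation for Massive Text and Development of a Chinese Text Understanding Application System" (2012AA011104)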
Keywords patent; rule-based method; statistical method; hybrid system; machine translation











