
Sequence Intersection Based Phrase Translation Extraction from Bilingual Corpus (cited by: 3)
Abstract: Phrase translation extraction is one of the core techniques in Example-Based Machine Translation (EBMT), and its accuracy directly affects the performance of an EBMT system. This paper proposes a phrase translation extraction method based on sequence intersection, in which each sentence is treated as a sequence of words. Given a Chinese-Japanese sentence-aligned corpus, the method first retrieves all source sentences that contain the phrase to be translated, then intersects the corresponding target sentences as sequences to obtain the phrase translation. By fully exploiting the information in the sentence-aligned bilingual corpus, the approach obtains high-quality phrase translations without requiring word alignment, syntactic parsing, or dictionaries. Experiments show that the accuracy of the extracted phrase translations exceeds 80%.
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2009, No. 1, pp. 38-43 (6 pages)
Keywords: computer application; Chinese information processing; EBMT; phrase translation extraction; sequence intersection
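
The abstract describes the method only at a high level, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes that "sequence intersection" of two target sentences means their longest common contiguous run of words, and that the most frequent pairwise intersection is taken as the phrase translation. The function names (`contains`, `longest_common_run`, `extract_translation`) and the three-sentence toy corpus are hypothetical and exist only for illustration.

```python
from collections import Counter
from itertools import combinations


def contains(sentence, phrase):
    """Return True if `phrase` occurs as a contiguous run of words in `sentence`."""
    n = len(phrase)
    return any(sentence[i:i + n] == phrase for i in range(len(sentence) - n + 1))


def longest_common_run(a, b):
    """Longest contiguous word run shared by a and b (assumed intersection)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return tuple(a[best_end - best_len:best_end])


def extract_translation(corpus, phrase):
    """Vote over pairwise intersections of target sentences whose source
    sentence contains `phrase`; return the most frequent one, or None."""
    targets = [tgt for src, tgt in corpus if contains(src, phrase)]
    votes = Counter()
    for t1, t2 in combinations(targets, 2):
        common = longest_common_run(t1, t2)
        if common:
            votes[common] += 1
    return votes.most_common(1)[0][0] if votes else None


# Toy word-segmented Chinese-Japanese sentence pairs (invented for illustration).
corpus = [
    (["我", "喜欢", "机器", "翻译"],
     ["私", "は", "機械", "翻訳", "が", "好き", "です"]),
    (["机器", "翻译", "很", "难"],
     ["機械", "翻訳", "は", "難しい"]),
    (["他", "研究", "机器", "翻译"],
     ["彼", "は", "機械", "翻訳", "を", "研究", "する"]),
]

print(extract_translation(corpus, ["机器", "翻译"]))
```

On this toy corpus two of the three pairwise intersections are `('機械', '翻訳')`, so that candidate wins the vote. No word alignment, parser, or dictionary is involved, which is the property the abstract emphasizes; how the paper itself filters or ranks candidate intersections is not stated in the abstract.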
