Extraction and Application of Discontinuous Phrase Templates in Statistical Machine Translation

Cited by: 2

Abstract: Current phrase-based statistical machine translation models rarely account for discontinuous phrases, so meaning is often distorted or lost in the target-language output. Taking discontinuous prepositional phrases as an example, this paper proposes a phrase template extraction algorithm. The algorithm first extracts Chinese discontinuous prepositional phrase templates with a rule-based method, then obtains their English translations using a bilingual aligned corpus and a preposition_locative-word translation table. The resulting bilingual templates are added to the phrase translation table. Comparative experiments on a standard test corpus show that, with the discontinuous phrase templates added, the translations conform better to grammatical norms and translation quality improves in a relatively stable way.
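The two steps named in the abstract (rule-based extraction of Chinese templates, then lookup in a preposition_locative-word translation table) can be illustrated with a minimal sketch. This is not the authors' algorithm: the word lists, the `MAX_GAP` limit, the table entries, and the function names `extract_templates` and `bilingual_templates` are all hypothetical, and the bilingual alignment step is left out entirely.

```python
from collections import Counter

# Hypothetical toy lexicons; the paper's actual rule set is not given in the abstract.
PREPOSITIONS = {"在", "从", "到"}
LOCATIVES = {"上", "中", "下", "里"}

# Hypothetical preposition_locative translation table: (prep, loc) -> English template.
PREP_LOC_TABLE = {
    ("在", "上"): "on X",
    ("在", "中"): "in X",
    ("在", "下"): "under X",
    ("从", "中"): "from X",
}

MAX_GAP = 4  # assumed upper bound on the number of words between preposition and locative


def extract_templates(tokens, max_gap=MAX_GAP):
    """Pair each preposition with the nearest following locative word.

    Returns (prep, loc, gap_tokens) triples. The gap must contain between
    1 and max_gap words and must not contain another preposition.
    """
    found = []
    for i, tok in enumerate(tokens):
        if tok not in PREPOSITIONS:
            continue
        for j in range(i + 1, min(i + 2 + max_gap, len(tokens))):
            if tokens[j] in PREPOSITIONS:
                break  # a new prepositional phrase starts here
            if tokens[j] in LOCATIVES and j > i + 1:
                found.append((tok, tokens[j], tuple(tokens[i + 1:j])))
                break
    return found


def bilingual_templates(sentences):
    """Count extracted templates and attach an English side from the table."""
    counts = Counter()
    for tokens in sentences:
        for prep, loc, _gap in extract_templates(tokens):
            counts[(prep, loc)] += 1
    templates = {}
    for (prep, loc), n in counts.items():
        english = PREP_LOC_TABLE.get((prep, loc))
        if english is not None:  # skip pairs the table does not cover
            templates[f"{prep} X {loc}"] = (english, n)
    return templates


if __name__ == "__main__":
    corpus = [
        ["他", "在", "桌子", "上", "写字"],
        ["在", "这", "本", "书", "中"],
        ["书", "在", "桌子", "上"],
    ]
    for source, (target, count) in bilingual_templates(corpus).items():
        print(source, "->", target, count)
```

The gap is replaced by a placeholder `X` on both sides, so a single template such as `在 X 上` covers any filler that satisfies the rule. A real system would also need translation scores for each template before it could be added to a phrase translation table.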

Authors: Sun Yueheng (孙越恒), Duan Nan (段楠), Hou Yuexian (侯越先)

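Affiliation: School of Computer Science and Technology, Tianjin University

Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University core journal list, 2009, No. 10, pp. 192-196 (5 pages)

Funding: National Natural Science Foundation of China (60603027); Microsoft Research Asia (MSRA)

Keywords: statistical machine translation; phrase template; discontinuous preposition phrases; template extraction

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]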
