Chinese-Uyghur Statistical Machine Translation Based on Syntactical Reordering (cited by: 2)
Abstract: Chinese and Uyghur differ substantially in morphology and word order. In Chinese-to-Uyghur statistical machine translation, these differences produce many unknown words and Uyghur output with confused word order. To address both problems, the paper studies the word order of the two languages and proposes a syntactic reordering method for Chinese, then applies morphological analysis to Uyghur, and evaluates the approach on a factor-based statistical machine translation (SMT) system. Experimental results show a substantial improvement in translation quality over the baseline phrase-based system: the BLEU score rises from 15.72 to 19.17.
Source: Computer Engineering (《计算机工程》), CAS, CSCD, 2012, No. 3, pp. 169-171, 175 (4 pages in total)
Funding: High-Tech Fund Project of the Western Action Plan of the Chinese Academy of Sciences (KGCX2-YN-507)
Keywords: statistical machine translation (SMT); syntactical reordering; morphology; factored model; translation model
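
The abstract does not list the reordering rules themselves. As an illustration only, the sketch below assumes one hypothetical rule: inside every VP of a Chinese parse tree, the verb moves behind its sibling constituents, so that the subject-verb-object order of Chinese approaches the subject-object-verb order of Uyghur. The tuple-based tree format, the function names `reorder` and `leaves`, and the example sentence are assumptions made for this sketch, not taken from the paper; the verb tags follow the Penn Chinese Treebank tag set.

```python
# Toy pre-reordering of a Chinese parse tree (illustrative sketch, not the paper's rules).
# A node is either (label, [children]) or a leaf (POS tag, word).

VERB_TAGS = {"VV", "VC", "VE", "VA"}  # Penn Chinese Treebank verb tags


def reorder(tree):
    """Recursively move the verbs of each VP behind the VP's other children."""
    label, body = tree
    if isinstance(body, str):  # leaf node: (POS tag, word)
        return tree
    children = [reorder(child) for child in body]
    if label == "VP":
        verbs = [c for c in children if c[0] in VERB_TAGS]
        others = [c for c in children if c[0] not in VERB_TAGS]
        children = others + verbs  # hypothetical rule: VP -> V NP becomes NP V
    return (label, children)


def leaves(tree):
    """Return the words of the tree from left to right."""
    label, body = tree
    if isinstance(body, str):
        return [body]
    return [word for child in body for word in leaves(child)]


# "我 读 书" (I read book): subject-verb-object order
sentence = ("IP", [
    ("NP", [("PN", "我")]),
    ("VP", [("VV", "读"), ("NP", [("NN", "书")])]),
])

print(" ".join(leaves(reorder(sentence))))  # 我 书 读
```

In a pipeline of this kind, the reordered Chinese side would replace the original source text before word alignment and training, so the translation model sees source sentences whose order is already closer to the target language.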