Morphology Processing in Chinese-Uyghur Statistical Machine Translation (cited by: 5)

Abstract: Chinese and Uyghur differ substantially in word order (Chinese is subject-verb-object, Uyghur is subject-object-verb) and in morphology. To address the word-order difference, hand-written reordering rules rearrange Chinese sentences into subject-object-verb order. To address the morphological difference, Uyghur words are split into smaller morpheme units (stems and affixes) before the statistical model is trained, and the effect of segmentation granularity on translation performance is tested. Experimental results show that reordering Chinese sentences and training on stems and affixes rather than whole words can effectively improve translation quality.
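Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2011, Issue 12, pp. 150-152 (3 pages)
Funding: High-Tech Fund Project of the Chinese Academy of Sciences Western Action Plan (KGCX2-YN-507)
Keywords: Chinese-Uyghur; statistical machine translation; morpheme; reordering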
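
The two preprocessing steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the role tags (`S`, `V`, `O`), the single reordering rule, the toy suffix list `SUFFIXES`, the `+` affix marker, and the Latin-script example words are all assumptions made for illustration. The `fine` flag only hints at what "segmentation granularity" could mean (one token per affix versus one merged affix token); the granularities actually tested in the paper are not stated in this record.

```python
from typing import List, Tuple

# Hypothetical toy suffix inventory in Latin-script Uyghur (illustration only).
SUFFIXES = ["lar", "ler", "ni", "ke", "da"]
MIN_STEM_LEN = 3  # assumed guard against stripping too much of a short word


def reorder_svo_to_sov(clause: List[Tuple[str, str]]) -> List[str]:
    """Move verbs that precede the last object to just after that object.

    `clause` is a list of (token, role) pairs; roles are assumed to come from
    an upstream parser, which this sketch does not provide.
    """
    roles = [role for _, role in clause]
    if "V" not in roles or "O" not in roles:
        return [tok for tok, _ in clause]
    last_obj = max(i for i, role in enumerate(roles) if role == "O")
    moved = [tok for i, (tok, role) in enumerate(clause)
             if role == "V" and i < last_obj]
    out: List[str] = []
    for i, (tok, role) in enumerate(clause):
        if role == "V" and i < last_obj:
            continue  # emitted later, after the object
        out.append(tok)
        if i == last_obj:
            out.extend(moved)
    return out


def segment(word: str, fine: bool = True) -> List[str]:
    """Strip known suffixes from the right and return stem plus affix tokens."""
    stem, affixes = word, []
    stripped = True
    while stripped:
        stripped = False
        for suf in SUFFIXES:
            if stem.endswith(suf) and len(stem) - len(suf) >= MIN_STEM_LEN:
                affixes.insert(0, suf)
                stem = stem[: -len(suf)]
                stripped = True
                break
    if not affixes:
        return [stem]
    if fine:
        return [stem] + ["+" + a for a in affixes]
    return [stem, "+" + "".join(affixes)]


if __name__ == "__main__":
    print(reorder_svo_to_sov([("我", "S"), ("读", "V"), ("书", "O")]))
    print(segment("kitablarni", fine=True))
    print(segment("kitablarni", fine=False))
```

In a pipeline of this kind, the reordered Chinese side and the segmented Uyghur side would both be produced before word alignment and phrase extraction, and the Uyghur affix tokens would be rejoined to their stems after decoding.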