
Method to Improve the Result of Uyghur-Chinese Word Alignment (cited by: 9)
Abstract: Uyghur is a typical agglutinative language, and its complex morphology and large number of affixes degrade the quality of Uyghur-Chinese word alignment. This paper proposes two preprocessing steps. First, each Uyghur word is morphologically analyzed and split into its stem and affixes before alignment. Second, because Uyghur follows phonetic harmony rules, the variants of each affix are replaced by a single unified symbol, so that all variants of an affix share the same form. Together, these steps are intended to reduce data sparsity in Uyghur-Chinese word alignment. The method was applied to the Uyghur-Chinese bilingual corpus provided by the Xinjiang Key Laboratory of Multilingual Information Technology, and the preprocessed sentences were then aligned with GIZA++. Experimental results show that the method clearly improves word alignment and also significantly improves the quality of Uyghur-Chinese machine translation.
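Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2012, No. 11, pp. 2551-2555 (5 pages)
Funding: National Natural Science Foundation of China (60663006); National Natural Science Foundation of China, Key Program (61032008); Electronic Development Fund of the Ministry of Industry and Information Technology (工信部财(2009)453)
Keywords: word alignment; Uyghur language; morphological analysis (morphological segmentation); GIZA++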
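The two preprocessing steps described in the abstract (stem/affix separation, then replacing harmony variants of an affix with one symbol) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the paper's actual morphological analyzer and affix inventory are not given in this record, the tiny affix table uses Latin-script forms of a few common Uyghur suffixes, the symbols `+PL`, `+LOC`, `+ABL` are hypothetical, and the greedy right-to-left suffix stripper stands in for a real morphological analysis.

```python
# Toy affix table (illustrative assumption, not the paper's inventory).
# Each canonical symbol lists surface variants that differ only by
# vowel/consonant harmony, written here in Latin script.
AFFIX_CLASSES = {
    "+PL": ["lar", "ler"],               # plural
    "+LOC": ["da", "de", "ta", "te"],    # locative
    "+ABL": ["din", "tin"],              # ablative
}

# Map every surface variant to its unified symbol.
VARIANT_TO_SYMBOL = {
    variant: symbol
    for symbol, variants in AFFIX_CLASSES.items()
    for variant in variants
}

# Try longer variants first so a short suffix never shadows a longer one.
VARIANTS = sorted(VARIANT_TO_SYMBOL, key=len, reverse=True)


def segment(word, min_stem=2):
    """Greedily strip known suffix variants from the right.

    Returns [stem, symbol, symbol, ...] with symbols in surface order.
    This is a stand-in for real morphological analysis and will
    over-segment words that merely end in a listed letter sequence.
    """
    symbols = []
    stem = word
    stripped = True
    while stripped:
        stripped = False
        for variant in VARIANTS:
            if stem.endswith(variant) and len(stem) - len(variant) >= min_stem:
                symbols.append(VARIANT_TO_SYMBOL[variant])
                stem = stem[: -len(variant)]
                stripped = True
                break
    return [stem] + symbols[::-1]


def preprocess(sentence):
    """Rewrite a whitespace-tokenized sentence as stems plus affix symbols."""
    tokens = []
    for word in sentence.split():
        tokens.extend(segment(word))
    return " ".join(tokens)


if __name__ == "__main__":
    print(preprocess("kitablarda mekteplerde"))
    # prints: kitab +PL +LOC mektep +PL +LOC
```

In this sketch the two words carry different surface suffixes (`-lar-da` and `-ler-de`) but end up with the same affix tokens, which is the sparsity-reducing effect the abstract describes. The output of `preprocess` is a plain token sequence of the kind that could be passed as the Uyghur side of a parallel corpus to an aligner such as GIZA++.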
