Research on the Effect of Uyghur Noun Suffixes on Uyghur-Chinese Word Alignment
Abstract: As a typical agglutinative language, Uyghur has a rich and complex morphology that often degrades Uyghur-Chinese word alignment. Discarding all suffixes and keeping only the stems alleviates data sparsity, but it also throws away much of the meaningful information that the suffixes carry. One remedy is to retain the suffixes after normalizing their variants to a unified form, but this makes sentences excessively long. To address these problems, we analyze the characteristics of the grammatical categories of Uyghur and Chinese and propose a split-and-drop scheme that retains suffixes selectively, and we apply it to Uyghur nouns. Experimental results show that the scheme is feasible and that it improves both word alignment accuracy and machine translation quality.
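Source: Journal of Xinjiang University (Natural Science Edition) (《新疆大学学报(自然科学版)》), indexed in CAS and the Peking University Core Journals list, 2015, No. 4, pp. 469-474 (6 pages)
Funding: National Natural Science Foundation of China (61262061); Science and Technology Program of the Xinjiang Uyghur Autonomous Region (201423120)
Keywords: word alignment; machine translation; Uyghur nouns; Uyghur language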
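
The three preprocessing options contrasted in the abstract (stems only, all suffixes kept in unified form, and selective split-and-drop) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the romanized suffix variants, the tag names in UNIFIED, and above all the KEEP set, since the abstract does not say which noun suffixes the authors retain. Input tokens are assumed to be already segmented into a stem and a list of suffixes by some upstream morphological analyzer.

```python
from typing import Iterable, List, Set, Tuple

# A token is (stem, suffixes), as produced by a hypothetical upstream segmenter.
Token = Tuple[str, List[str]]

# Illustrative map from surface variants to one unified tag per suffix.
# The variants and tag names are assumptions, not the paper's inventory.
UNIFIED = {
    "lar": "+PL", "ler": "+PL",
    "da": "+LOC", "de": "+LOC", "ta": "+LOC", "te": "+LOC",
    "ning": "+GEN",
}

# Assumption: the unified suffixes worth keeping. The paper's actual
# selection is not given in the abstract.
KEEP: Set[str] = {"+LOC"}


def stem_only(tokens: Iterable[Token]) -> List[str]:
    """Drop every suffix: less sparsity, but suffix information is lost."""
    return [stem for stem, _ in tokens]


def keep_all(tokens: Iterable[Token]) -> List[str]:
    """Keep every suffix in unified form: nothing lost, longer sentences."""
    out: List[str] = []
    for stem, suffixes in tokens:
        out.append(stem)
        out.extend(UNIFIED.get(s, s) for s in suffixes)
    return out


def split_drop(tokens: Iterable[Token], keep: Set[str] = KEEP) -> List[str]:
    """Split off suffixes, then keep only the unified tags listed in keep."""
    out: List[str] = []
    for stem, suffixes in tokens:
        out.append(stem)
        for s in suffixes:
            tag = UNIFIED.get(s, s)
            if tag in keep:
                out.append(tag)
    return out


# Two segmented nouns, romanized for illustration only.
sentence: List[Token] = [("kitab", ["lar", "da"]), ("mektep", ["ler", "ning"])]

print(stem_only(sentence))   # 2 tokens
print(keep_all(sentence))    # 6 tokens
print(split_drop(sentence))  # 3 tokens
```

The selective variant sits between the two extremes in sentence length, which is the trade-off the abstract describes. Choosing the retained set from a comparison of Uyghur and Chinese grammatical categories is the paper's contribution and is not reproduced here.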