A Bilingual Word Alignment Algorithm of Naxi-Chinese Based on Feature Constraint Models

Cited by: 2
Abstract: Naxi and Chinese differ substantially in syntactic structure, which makes automatic bilingual word alignment difficult. To address this, a Naxi-Chinese bilingual word alignment algorithm that incorporates feature constraint models is proposed. First, the interval distortion and position transformation characteristics of Naxi-Chinese word pairs are counted from a corpus, and two feature constraint models for bilingual word alignment, an interval distortion model and a position transformation model, are built from these statistics. The two models are then integrated into the log-linear model framework for word alignment, and the model parameters are trained with the minimum error rate algorithm. Finally, the best word alignment is found by search. In experiments that use IBM Model 3 as the baseline alignment model, the proposed algorithm improves Naxi-Chinese word alignment accuracy by 21.9%.
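As a rough illustration of the log-linear framework mentioned in the abstract, the sketch below scores each candidate alignment as a weighted sum of feature values and keeps the highest-scoring one. Everything in it is an assumption made for illustration: the toy translation table `LOG_T`, the simple jump penalty in `distortion`, the function names, and the hand-set weights. It does not reproduce the paper's interval distortion or position transformation models, and in the paper the weights are trained with the minimum error rate algorithm rather than fixed by hand.

```python
import itertools
import math

# Hypothetical log translation probabilities: LOG_T[j][i] is the score for
# linking source word j to target word i. Values are invented for this demo.
LOG_T = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],
    [math.log(0.1), math.log(0.1), math.log(0.8)],
    [math.log(0.2), math.log(0.6), math.log(0.2)],
]


def translation(alignment):
    """Sum of log translation probabilities over all links."""
    return sum(LOG_T[j][i] for j, i in enumerate(alignment))


def distortion(alignment):
    """Placeholder distortion feature: penalize non-monotone jumps.

    This is NOT the paper's interval distortion model, only a stand-in.
    """
    return -sum(abs(alignment[j] - alignment[j - 1] - 1)
                for j in range(1, len(alignment)))


FEATURES = {"translation": translation, "distortion": distortion}


def score(alignment, weights):
    """Log-linear score: weighted sum of feature values."""
    return sum(weights[name] * h(alignment) for name, h in FEATURES.items())


def best_alignment(src_len, tgt_len, weights):
    """Exhaustive search; each source word links to exactly one target word."""
    candidates = itertools.product(range(tgt_len), repeat=src_len)
    return max(candidates, key=lambda a: score(a, weights))


# With the distortion weight at zero, the translation table alone decides.
print(best_alignment(3, 3, {"translation": 1.0, "distortion": 0.0}))
# A large distortion weight pushes the result toward a monotone alignment.
print(best_alignment(3, 3, {"translation": 1.0, "distortion": 10.0}))
```

Exhaustive enumeration is only workable for toy sentence lengths; it is used here to keep the relationship between feature weights and the selected alignment easy to see.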
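Source: Journal of Xi'an Jiaotong University (indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2011, No. 10, pp. 48-53 (6 pages)
Funding: National Natural Science Foundation of China (60863011); Key Project of the Natural Science Foundation of Yunnan Province (2008CC023)
Keywords: word alignment; Naxi; Chinese; feature constraint model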