Error Detection for Statistical Machine Translation Based on Feature Comparison and Maximum Entropy Model Classifier
Abstract This paper first introduces three typical word posterior probability (WPP) features used for translation error detection and classification: fixed-position WPP, sliding-window WPP, and alignment-based WPP, and analyzes their impact on error detection performance. Each WPP feature is then combined with linguistic features (words, part-of-speech tags, and syntactic features extracted by the LG parser), and a maximum entropy classifier is used to predict translation errors. Experiments on the Chinese-to-English NIST datasets show that the choice of WPP feature has a significant effect on the classification error rate (CER), and that adding linguistic features to WPP significantly reduces the CER and improves the prediction of translation errors.
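Authors Du Jinhua (杜金华), Wang Sha (王莎)
Source Acta Scientiarum Naturalium Universitatis Pekinensis (《北京大学学报(自然科学版)》), indexed in EI, CAS, CSCD, and the PKU Core list; 2013, No. 1, pp. 81-87 (7 pages)
Funding Supported by the National Natural Science Foundation of China (61100085), the Special Scientific Research Program of the Shaanxi Provincial Department of Education (11JK1029), and the Young Scientists Research Program of Xi'an University of Technology (105211017)
Keywords error detection; word posterior probability; linguistic features; maximum entropy classifier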
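
As an illustration of the fixed-position WPP named in the abstract, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes an N-best list of tokenized hypotheses with log scores, normalizes those scores into hypothesis posteriors, and sums the posteriors of all hypotheses that place a given word at a given target position. The function names `fixed_position_wpp` and `classification_error_rate`, the toy N-best list, and the definition of CER as the fraction of wrongly tagged words are all assumptions made for this example.

```python
import math
from collections import defaultdict


def fixed_position_wpp(nbest):
    """Fixed-position word posterior probabilities from an N-best list.

    nbest: list of (tokens, log_score) pairs (hypothetical input format).
    Returns a dict mapping (position, word) to its posterior probability.
    """
    # Turn log scores into normalized hypothesis posteriors (stable softmax).
    max_log = max(score for _, score in nbest)
    weights = [math.exp(score - max_log) for _, score in nbest]
    total = sum(weights)

    wpp = defaultdict(float)
    for (tokens, _), weight in zip(nbest, weights):
        for position, word in enumerate(tokens):
            # A word collects the posterior mass of every hypothesis
            # that has this word at exactly this position.
            wpp[(position, word)] += weight / total
    return dict(wpp)


def classification_error_rate(predicted, reference):
    """Fraction of words whose predicted tag differs from the reference tag."""
    wrong = sum(p != r for p, r in zip(predicted, reference))
    return wrong / len(reference)


# Toy N-best list (made up for illustration only).
nbest = [
    ("the cat sat".split(), math.log(0.5)),
    ("the cat sits".split(), math.log(0.3)),
    ("a cat sat".split(), math.log(0.2)),
]
wpp = fixed_position_wpp(nbest)
```

In a pipeline like the one the abstract describes, each WPP value would be one feature of a word, alongside the linguistic features, and the maximum entropy classifier would tag the word as correct or incorrect; CER then compares those tags with reference tags.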
