Research of Deep Filtering Lexical Reordering Table (调序规则表的深度过滤研究) Cited by: 4
Abstract: In statistical machine translation systems, both the lexical reordering table and the phrase table are usually very large. Optimizing and filtering the phrase table has long been a research focus, whereas filtering the lexical reordering table has received almost no attention. This paper treats the filtering of the lexical reordering table as a short-text classification problem and proposes a filtering model based on an autoencoder. The model first uses an autoencoder-based classifier to score the reordering rules, then filters the lexical reordering table with a minimal difference strategy, and finally uses the filtered table to recompute the reordering score table used in machine translation decoding. Experiments on public English-Chinese and Uyghur-Chinese corpora show that the model reduces the size of the lexical reordering table by 40% while increasing the BLEU (bilingual evaluation understudy) score by 0.19 and 0.26, respectively.
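The three-step pipeline in the abstract (score, filter, recompute) can be illustrated with a minimal sketch. The abstract specifies neither the autoencoder classifier nor the minimal difference strategy, so both are replaced here by stand-ins: each rule instance is assumed to already carry a classifier score, and the filter simply keeps the highest-scoring 60% of instances. The function names, the orientation labels `mono`/`swap`/`disc`, and the smoothing constant `alpha` are all hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict

# Orientation classes commonly used in lexicalized reordering models
# (assumed here; the abstract does not list them).
ORIENTATIONS = ("mono", "swap", "disc")


def filter_rules(rules, keep_percent=60):
    """Keep the highest-scoring share of rule instances.

    Each rule is a tuple (source, target, orientation, score), where the
    score is assumed to come from some classifier. This top-k cut is a
    stand-in, not the paper's minimal difference strategy.
    """
    ranked = sorted(rules, key=lambda rule: rule[3], reverse=True)
    keep = len(ranked) * keep_percent // 100
    return ranked[:keep]


def reordering_table(rules, alpha=0.5):
    """Re-estimate p(orientation | source, target) from the kept instances.

    Uses additive smoothing with constant alpha so that every orientation
    receives a non-zero probability.
    """
    counts = defaultdict(Counter)
    for source, target, orientation, _score in rules:
        counts[(source, target)][orientation] += 1

    table = {}
    for pair, pair_counts in counts.items():
        total = sum(pair_counts.values()) + alpha * len(ORIENTATIONS)
        table[pair] = {
            o: (pair_counts[o] + alpha) / total for o in ORIENTATIONS
        }
    return table
```

Calling `reordering_table(filter_rules(rules))` yields a smaller table whose probabilities are re-estimated from the surviving instances only, which mirrors the "filter, then recompute the score table" order described above.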
Authors: KONG Jinying (孔金英); LI Xiao (李晓); WANG Lei (王磊); YANG Yating (杨雅婷); LUO Yangen (罗延根). Affiliations: Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, Peking University Core Journal, 2017, Issue 5, pp. 785-793 (9 pages)
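Funding: National High Technology Research and Development Program of China (863 Program), No. 2013AA01A607; Strategic Priority Research Program of the Chinese Academy of Sciences, No. XDA06030400; "West Light" Program of the Chinese Academy of Sciences, Nos. XBBS201216 and LHXZ201301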
Keywords: Autoencoder; filtering model; lexical reordering table; machine translation
