
Sentence-Level Machine Translation Quality Estimation Based on Neural Network Features
Cited by: 14
Abstract: Machine translation quality estimation is an important task in natural language processing. Unlike traditional automatic evaluation of machine translation, quality estimation assesses the quality of machine translation output without human reference translations. Current feature extraction approaches for sentence-level quality estimation depend heavily on linguistic analysis, which limits their generalization ability and constrains the performance of the subsequent support vector regression algorithm. To address this problem, we extract sentence embedding features using a context-based word prediction model and a matrix decomposition model from deep learning, and combine them with recurrent neural network language model features to further improve the correlation between automatic quality estimation and human judgments. Experimental results on the datasets of the WMT'15 and WMT'16 machine translation quality estimation subtasks show that extracting sentence embedding features with the context-based word prediction model consistently outperforms both the traditional QuEst method and the approach that extracts sentence embedding features with a continuous space language model. This indicates that the proposed feature extraction approach requires no linguistic analysis and significantly improves the performance of machine translation quality estimation.
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list, 2017, Issue 8, pp. 1804-1812 (9 pages)
Funding: National Natural Science Foundation of China (61462044, 61662031, 61462045)
Keywords: machine translation quality estimation; sentence-level; word embedding; recurrent neural network language model; support vector regression
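The abstract describes a pipeline that builds sentence vector features, adds a language model feature, and scores the system by its correlation with human judgments. The following is a minimal sketch of the feature-building and evaluation steps only. It is not the authors' implementation: averaging word vectors is a simple stand-in for the paper's sentence embedding extraction, and the names `sentence_vector`, `build_features`, and `pearson` are hypothetical. The regression step is omitted; in the paper the features feed a support vector regression model.

```python
from math import sqrt


def sentence_vector(tokens, embeddings, dim):
    """Average the word vectors of known tokens.

    Assumption: plain averaging stands in for the paper's sentence
    embedding extraction, which is not specified in the abstract.
    Returns a zero vector when no token has an embedding.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def build_features(src_tokens, mt_tokens, src_emb, tgt_emb, dim, lm_score):
    """Concatenate source vector, translation vector, and one LM score.

    lm_score is a hypothetical scalar (for example a log-probability)
    produced elsewhere by a language model over the translation.
    """
    return (
        sentence_vector(src_tokens, src_emb, dim)
        + sentence_vector(mt_tokens, tgt_emb, dim)
        + [lm_score]
    )


def pearson(predicted, human):
    """Pearson correlation between predicted and human quality scores."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_h = sum(human) / n
    cov = sum((p - mean_p) * (h - mean_h) for p, h in zip(predicted, human))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_h = sum((h - mean_h) ** 2 for h in human)
    return cov / sqrt(var_p * var_h)
```

A regressor trained on the output of `build_features` would produce the predicted scores passed to `pearson`; a higher correlation with the human scores means a better estimator.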
