Abstract
We introduce a feature set for automated human translation quality estimation (AHTQE). The set comprises three types of translation quality indicators: monolingual, bilingual, and language model (LM) features. Machine learning techniques can be applied to these features to build AHTQE systems that predict translation quality in terms of content adequacy and language fluency. We compare the feature set with the QuEst baseline set by training models with support vector machines (SVM) and relevance vector machines (RVM) on the same data set. We also report a feature selection experiment that uses a simulated annealing (SA) algorithm to pick a smaller subset of more informative features from the full set, on which the model is then retrained. The experiments show that models trained on our feature set consistently predict fluency better than models trained on the baseline set, while no significant difference is found between the two for adequacy. After feature selection, our scoring model predicts fluency significantly better. Both the baseline set and our feature set estimate translation adequacy better than translation fluency.
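The feature selection step mentioned in the abstract can be illustrated with a minimal sketch of simulated annealing over feature subsets. The abstract does not give the authors' neighbourhood move, cooling schedule, or objective, so everything below is an assumption: the function names `anneal_feature_subset` and `toy_cost`, the single-feature flip move, the geometric cooling, and the placeholder cost. In a real setting the cost would presumably be the validation error of an SVM or RVM model trained on the candidate subset.

```python
import math
import random
from typing import Callable, FrozenSet, Tuple


def anneal_feature_subset(
    n_features: int,
    cost: Callable[[FrozenSet[int]], float],
    steps: int = 2000,
    t_start: float = 1.0,
    t_end: float = 1e-3,
    seed: int = 0,
) -> Tuple[FrozenSet[int], float]:
    """Search for a low-cost feature subset with simulated annealing.

    Hypothetical helper: `cost` maps a set of feature indices to a score
    where lower is better (e.g. a model's validation error).
    """
    rng = random.Random(seed)
    current = frozenset(range(n_features))  # start from the full feature set
    current_cost = cost(current)
    best, best_cost = current, current_cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Neighbour move: add or drop one randomly chosen feature.
        flip = rng.randrange(n_features)
        candidate = current ^ {flip}
        if not candidate:
            continue  # never evaluate an empty subset
        cand_cost = cost(candidate)
        delta = cand_cost - current_cost
        # Accept improvements always; accept worse moves with
        # probability exp(-delta / temp), which shrinks as temp falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


def toy_cost(subset: FrozenSet[int]) -> float:
    """Placeholder objective, not from the paper: features 0, 3 and 5 are
    treated as informative, and every extra feature adds a small penalty."""
    useful = {0, 3, 5}
    missing = len(useful - subset)
    extras = len(subset - useful)
    return missing * 1.0 + extras * 0.1


if __name__ == "__main__":
    subset, score = anneal_feature_subset(8, toy_cost)
    print(sorted(subset), score)
```

Swapping `toy_cost` for a function that trains a regressor on the selected columns and returns its held-out error would turn the sketch into a wrapper-style selector, at the price of one model fit per annealing step.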
Source
《外语教学与研究》
CSSCI
PKU Core Journal (北大核心)
2016, Issue 5, pp. 776-787, 801 (12 pages in total)
Foreign Language Teaching and Research
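Funding
Supported by the State-Sponsored Graduate Study Abroad Program for Building High-Level Universities (国家建设高水平大学公派研究生项目, Liujinfa [2013] No. 3009)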