Abstract
Quality estimation is an important task in machine translation and plays a significant role in its development and application. This paper proposes a simple and effective Transformer-based unified (joint) model for quality estimation of machine translation. The model consists of a Transformer bottleneck layer and a bidirectional long short-term memory (Bi-LSTM) network. The parameters of the Transformer bottleneck layer are first optimized on a bilingual parallel corpus; all parameters of the model are then jointly optimized and fine-tuned on the quality estimation training data. At test time, the machine translation output to be evaluated is processed with teacher forcing and a special masking, and is fed into the unified model together with the source sentence to predict the quality of the translation. Experimental results on the datasets of the CWMT18 quality estimation task show that the proposed model significantly outperforms the baseline models trained on the same amount of data and performs comparably to the best baseline model trained on a very large-scale bilingual corpus.
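The teacher-forcing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the paper's "special masking", so a standard causal (left-to-right) attention mask is used here purely as a stand-in. The names `BOS`, `teacher_forced_inputs`, and `causal_mask` are hypothetical and are not taken from the paper.

```python
from typing import List

BOS = "<s>"  # hypothetical begin-of-sentence symbol (assumption, not from the paper)


def teacher_forced_inputs(mt_tokens: List[str]) -> List[str]:
    """Build decoder inputs under teacher forcing.

    The given machine translation is shifted right by one position, so the
    decoder at position i is conditioned on the actual MT tokens before i
    rather than on its own predictions.
    """
    if not mt_tokens:
        return []
    return [BOS] + mt_tokens[:-1]


def causal_mask(n: int) -> List[List[bool]]:
    """Standard causal mask: mask[i][j] is True when position i may attend to j.

    This is only a stand-in; the paper's special masking may differ.
    """
    return [[j <= i for j in range(n)] for i in range(n)]


mt = ["the", "cat", "sat"]
decoder_inputs = teacher_forced_inputs(mt)
mask = causal_mask(len(mt))
```

In a setup like this, the decoder states computed for the given translation, rather than freshly generated text, would be what a downstream network such as the Bi-LSTM consumes to predict a quality score.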
Authors
CHEN Cong (陈聪)
LI Maoxi (李茂西)
LUO Qi (罗琪)
School of Computer and Information Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022, China
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
Peking University Core Journals (北大核心)
2021, Issue 6, pp. 47-54 (8 pages)
Funding
National Natural Science Foundation of China (61662031, 61462044).