Abstract
Automatic evaluation of machine translation plays an important role in promoting the development and application of machine translation. The latest neural evaluation methods use pretrained contextual embeddings to extract deep semantic features and then simply concatenate them as input to a multi-layer neural network that predicts translation quality. Direct concatenation tends to leave the features without deep fusion, and layer-by-layer abstraction during prediction tends to lose fine-grained exact-matching information. To address these problems, we propose introducing middle-stage and late-stage information fusion into the automatic evaluation of machine translation. Specifically, we use embrace fusion to interactively fuse different features in the middle stage, and in the late stage we fuse sentence mover's distance and sentence-level cosine similarity, both based on fine-grained exact matching. Experimental results on the WMT'21 Metrics Task benchmark datasets show that the proposed method effectively improves correlation with human judgments and achieves performance comparable to the best metrics in the evaluation campaign.
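The two fusion ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names (`dock`, `embrace_fuse`, `cosine_similarity`), the linear-plus-ReLU projection, and the uniform sampling probabilities are all assumptions made for illustration. Embrace fusion is sketched here as projecting each feature vector to a shared size and then, for every output component, sampling which feature supplies it. The sentence-level cosine similarity is the standard definition; sentence mover's distance is omitted because it needs an optimal-transport solver.

```python
import math
import random


def dock(x, W):
    """Project feature vector x to the shared size: linear map W, then ReLU (assumed)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W]


def embrace_fuse(features, weights, rng):
    """Fuse feature vectors of different sizes into one shared-size vector.

    features: list of feature vectors (lists of floats)
    weights:  one projection matrix per feature, each with the same row count
    rng:      a random.Random instance that picks the source of each component
    """
    docked = [dock(x, W) for x, W in zip(features, weights)]
    size = len(docked[0])
    # For each output component, sample one source feature.
    # Uniform probabilities are an assumption of this sketch.
    return [docked[rng.randrange(len(docked))][i] for i in range(size)]


def cosine_similarity(u, v):
    """Cosine similarity of two non-zero sentence vectors, a late-stage feature."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

In such a setup the fused vector would feed the prediction network, while the cosine similarity would be combined with the network output at a later stage. How the paper performs that combination is not stated in the abstract.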
Authors
刘媛
李茂西
项青宇
李易函
LIU Yuan; LI Maoxi; XIANG Qingyu; LI Yihan (School of Computer and Information Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022, China; Management Science and Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022, China; College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 210000, China)
Source
《中文信息学报》
CSCD
PKU Core Journal
2023, Issue 3, pp. 89-100 (12 pages)
Journal of Chinese Information Processing
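Funding
National Natural Science Foundation of China (61662031, 61462044)
Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ210306)
Ministry of Education Industry-University Cooperative Education Project (220604647062739)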
Keywords
machine translation
automatic evaluation of machine translation
information fusion
information representation
embrace fusion