Cross-Lingual Summarization Technology Based on Multi-stage Training (基于多阶段训练的跨语言摘要技术)

Abstract: Cross-lingual summarization (CLS) models are often weak at semantic understanding, cross-lingual alignment, and text generation. To address this, the paper proposes an English-to-Chinese CLS model built with multi-stage training. First, multilingual denoising pre-training teaches the model general language knowledge of both Chinese and English. Second, fine-tuning on multilingual machine translation jointly trains semantic understanding of English, cross-lingual alignment from English to Chinese, and Chinese text generation. Finally, fine-tuning on the CLS task further adapts these three abilities to CLS, yielding the final English-to-Chinese CLS model. Experiments show that the proposed model clearly improves CLS performance and that both multilingual denoising pre-training and multilingual machine translation contribute to the gain. On an English-to-Chinese CLS benchmark dataset, the model improves ROUGE-1, ROUGE-2, and ROUGE-L by 45.70%, 60.53%, and 43.57%, respectively, over the best-performing baseline.
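The three stages described in the abstract can be pictured as one sequence-to-sequence model trained on three kinds of (source, target) pairs in a fixed order. The sketch below is only an illustration of that ordering, not the authors' implementation: the `<mask>` token, the simple token-masking noise, the 0.3 mask ratio, and every function name are assumptions, since the abstract does not specify the noising function, the architecture, or any hyperparameters.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

MASK = "<mask>"  # hypothetical mask token, not taken from the paper


@dataclass
class Example:
    source: str  # text fed to the encoder
    target: str  # text the decoder should generate


def add_noise(tokens: List[str], mask_ratio: float, rng: random.Random) -> List[str]:
    """Replace a random subset of tokens with MASK.

    Assumption: plain token masking stands in for whatever noising
    function the paper actually uses.
    """
    if not tokens:
        return []
    n_mask = min(len(tokens), max(1, round(len(tokens) * mask_ratio)))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    return [MASK if i in positions else tok for i, tok in enumerate(tokens)]


def denoising_examples(sentences: Iterable[List[str]], rng: random.Random,
                       mask_ratio: float = 0.3) -> Iterator[Example]:
    """Stage 1: noisy text -> original text, over Chinese and English sentences."""
    for tokens in sentences:
        yield Example(" ".join(add_noise(tokens, mask_ratio, rng)), " ".join(tokens))


def translation_examples(pairs: Iterable[Tuple[str, str]]) -> Iterator[Example]:
    """Stage 2: English sentence -> Chinese sentence."""
    for en, zh in pairs:
        yield Example(en, zh)


def cls_examples(pairs: Iterable[Tuple[str, str]]) -> Iterator[Example]:
    """Stage 3: English document -> Chinese summary."""
    for en_doc, zh_summary in pairs:
        yield Example(en_doc, zh_summary)


def run_stages(train_step: Callable[[str, Example], None],
               stages: List[Tuple[str, Iterable[Example]]]) -> Dict[str, int]:
    """Feed each stage's examples to the same train_step, in the given order.

    Returns the number of examples consumed per stage.
    """
    counts: Dict[str, int] = {}
    for name, examples in stages:
        n = 0
        for ex in examples:
            train_step(name, ex)  # the same model is updated in every stage
            n += 1
        counts[name] = n
    return counts
```

In a real setup `train_step` would run one optimizer update on a shared encoder-decoder model; here it is left as a callback so that the stage ordering can be inspected without any deep-learning dependency.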

Authors: PAN Hangyu; XI Yaoyi; ZHOU Huijuan; CHEN Gang; GUO Zhigang (Information Engineering University, Zhengzhou 450001, China)

Affiliation: Information Engineering University

Source: Journal of Information Engineering University (《信息工程大学学报》), 2024, No. 2, pp. 139-147 (9 pages)

Funding: National Social Science Fund of China (19CXW027).

Keywords: cross-lingual summarization; multi-stage training; multilingual denoising pre-training; multilingual machine translation

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

