Abstract
Multi-modal abstractive summarization is commonly built on the Sequence-to-Sequence (Seq2Seq) framework, whose objective function optimizes the model at the character level: words are generated from locally optimal solutions, and the global semantic information of the summary sample is ignored. As a result, the summary can deviate semantically from the multi-modal input, which easily leads to factual errors. To address these problems, a multi-modal summarization model based on semantic relevance analysis was proposed. Firstly, a summary generator based on the Seq2Seq framework was trained to generate semantically diverse candidate summaries. Secondly, a summary evaluator based on semantic relevance analysis was constructed to learn, from a global perspective, the semantic differences among candidate summaries and the ranking pattern of the real evaluation metric ROUGE (Recall-Oriented Understudy for Gisting Evaluation), so that the model was optimized at the level of summary samples. Finally, the summary evaluator was used to evaluate the candidate summaries without relying on reference summaries, so that the selected summary was as similar as possible to the source text in semantic space. Experimental results on the public dataset MMSS show that, compared with the current best-performing MPMSE (Multimodal Pointer-generator via Multimodal Selective Encoding) model, the proposed model improves ROUGE-1, ROUGE-2 and ROUGE-L by 3.17, 1.21 and 2.24 percentage points, respectively.
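The second and third steps of the abstract (learning a ROUGE-consistent ranking over candidates, then selecting a candidate without a reference summary) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the loss function or the encoder, so the pairwise margin ranking loss, the cosine-similarity scoring, and the names `cosine`, `select_summary`, `ranking_loss` and `margin` are all assumptions made for illustration. Embeddings are taken as plain lists of floats that some encoder is assumed to have produced.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_summary(source_vec: Vector, candidate_vecs: List[Vector]) -> int:
    """Reference-free selection (assumed form): return the index of the
    candidate whose embedding is closest to the source embedding."""
    scores = [cosine(source_vec, c) for c in candidate_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


def ranking_loss(scores: Sequence[float], margin: float = 0.01) -> float:
    """Pairwise margin ranking loss (assumed form).

    `scores` are evaluator scores for candidates that have already been
    sorted by ROUGE, best first. A pair (i, j) with i < j is penalized when
    the worse candidate j is not scored at least (j - i) * margin below i.
    """
    loss = 0.0
    n = len(scores)
    for i in range(n):
        for j in range(i + 1, n):
            loss += max(0.0, scores[j] - scores[i] + (j - i) * margin)
    return loss


if __name__ == "__main__":
    source = [1.0, 0.0]
    candidates = [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5]]
    print(select_summary(source, candidates))
    print(ranking_loss([0.9, 0.5, 0.1]))
```

At training time the loss is zero only when the evaluator's scores respect the ROUGE order with the required margins; at inference time no reference summary is needed, because only the source and candidate embeddings are compared.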
Authors
林于翔
吴运兵
阴爱英
廖祥文
LIN Yuxiang; WU Yunbing; YIN Aiying; LIAO Xiangwen (College of Computer and Data Science, Fuzhou University, Fuzhou Fujian 350108, China; Digital Fujian Institute of Financial Big Data, Fuzhou Fujian 350108, China; Department of Computer Engineering, Zhicheng College of Fuzhou University, Fuzhou Fujian 350002, China)
Source
《计算机应用》
CSCD
PKU Core Journal (北大核心)
2024, No. 1, pp. 65-72 (8 pages)
Journal of Computer Applications
Funding
National Natural Science Foundation of China (61976054)
Natural Science Foundation of Fujian Province (2022J01116)
Keywords
multi-modal
abstractive summarization
Sequence-to-Sequence (Seq2Seq)
factual error
semantic relevance