Cross-lingual Vietnamese-Chinese news text summarization method with image fusion
Abstract [Objective] To effectively remove redundant textual information and make summaries more concise, while making full use of image information to improve summary accuracy, this paper studies a Vietnamese-Chinese cross-lingual news text summarization method that fuses image information. [Methods] First, a text encoder and an image encoder represent the Vietnamese news text and images. Second, an image-text contrastive loss strengthens the consistency between the image and text representations, forcing the Vietnamese representation space toward the language-independent image representation space. Third, an image-text fuser fuses images and text to strengthen the extraction of key information from the news text. Finally, a summary decoder generates the Chinese summary. [Results] On the Vietnamese-Chinese multimodal cross-lingual summarization dataset constructed in this paper, the summaries generated by the proposed method achieve higher ROUGE scores, informativeness, conciseness, and fluency than those of the comparison methods. [Conclusions] Introducing image information helps generate high-quality cross-lingual summaries; directly learning the interaction between the two languages in a single task reduces the error accumulation caused by decomposing cross-lingual summarization into multiple tasks.

[Objective] The Vietnamese-Chinese cross-lingual news summarization task aims to convert Vietnamese news into Chinese summaries that are concise, accurate, and readable. Existing work on this task focuses mainly on summarizing and extracting textual information. Although this improves the accuracy of the generated summaries to a certain extent, it ignores the importance of images in news reports.

[Methods] This paper therefore proposes a Vietnamese-Chinese cross-lingual news text summarization method that integrates image information, and explores how to use image information effectively. Because image-text cross-lingual summarization datasets are lacking, the paper constructs a real-world dataset of 142,000 news sample pairs and 235,770 news images collected from multiple Vietnamese news websites. First, the Vietnamese news text and images are represented using a text encoder and an image encoder. Second, an image-text contrastive loss is used to enhance the consistency of the image and text representations, forcing the Vietnamese representation space to approach the language-independent image representation space. Third, an image-text fuser is used to fuse images and text, enhancing the ability to extract key information from news texts. Finally, a summary decoder is used to generate a Chinese summary.

[Results] To demonstrate the effectiveness of the proposed method, its performance is compared with that of six baseline methods on the dataset constructed in this paper. First, the experimental results show that the model improves significantly on traditional cross-lingual summarization models. Second, comparisons with end-to-end cross-lingual summarization models such as NCLS indicate that integrating image information can effectively improve cross-lingual summarization performance. The paper also reports ablation experiments. Model performance dropped significantly after removing the image encoding module and the image-text fusion module. Performance also dropped after removing the image-text contrastive loss module, and randomly selecting an image and replacing it with an image synthesized from Gaussian noise likewise reduced performance. In addition, a hyper-parameter analysis explores how the ratio between the number of text encoding layers and the number of image-text encoding layers affects overall model performance. The ROUGE score is highest when three layers are text encoders and three layers are image-text encoders. Finally, a manual evaluation is added to assess the summaries generated by the model. The informativeness, conciseness, and fluency scores of MH-CLS are higher than those of Sum-Trans, Trans-Sum, and MCLAS, further suggesting the effectiveness of the method.

[Conclusions] The proposed Vietnamese-Chinese cross-lingual news text summarization method that fuses image information achieves significant improvements over existing cross-lingual summarization methods. Analysis of the experimental results shows that adding image information and the image-text contrastive module to guide summary generation plays an important role in improving the quality of cross-lingual news summaries. The synergy of images and text is fully utilized in image-text fusion and key information extraction, so the method extracts key information better and achieves satisfactory results in terms of summary information volume, accuracy, and information richness. These advantages demonstrate the vital role of images in cross-lingual summarization and show that the approach can use image information effectively to improve both the quality and the understandability of summaries.
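The abstract names an image-text contrastive loss but does not give its formula. As an illustration only, the sketch below assumes a symmetric InfoNCE objective of the kind commonly used for image-text alignment: paired text and image vectors in a batch are pulled together, and all other pairings in the batch act as negatives. The function names, the temperature value of 0.07, and the use of plain Python lists are assumptions made for this example and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def image_text_contrastive_loss(text_vecs, image_vecs, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired text/image vectors.

    Row i of text_vecs and row i of image_vecs are assumed to come from the
    same news item; every other row in the batch serves as a negative.
    The temperature default is an illustrative assumption.
    """
    if len(text_vecs) != len(image_vecs) or not text_vecs:
        raise ValueError("need the same non-zero number of texts and images")
    texts = [l2_normalize(v) for v in text_vecs]
    images = [l2_normalize(v) for v in image_vecs]
    n = len(texts)
    # Cosine similarities scaled by the temperature: sim[i][j] = t_i . v_j / tau
    sim = [
        [sum(a * b for a, b in zip(t, v)) / temperature for v in images]
        for t in texts
    ]
    # Text-to-image direction: the matching image lies on the diagonal.
    t2i = -sum(log_softmax(sim[i])[i] for i in range(n)) / n
    # Image-to-text direction: same computation on the transposed matrix.
    sim_t = [list(col) for col in zip(*sim)]
    i2t = -sum(log_softmax(sim_t[i])[i] for i in range(n)) / n
    return 0.5 * (t2i + i2t)


if __name__ == "__main__":
    texts = [[1.0, 0.0], [0.0, 1.0]]
    aligned = image_text_contrastive_loss(texts, [[1.0, 0.0], [0.0, 1.0]])
    swapped = image_text_contrastive_loss(texts, [[0.0, 1.0], [1.0, 0.0]])
    print(aligned < swapped)
```

In this toy batch the loss is smaller when each text vector points the same way as its paired image vector than when the pairing is swapped, which is the behavior a contrastive alignment term is meant to reward. In a real model the vectors would come from the text and image encoders, and the loss would be added to the summarization loss with a weighting that the abstract does not specify.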
Authors WU Qiyuan (吴奇远); YU Zhengtao (余正涛); HUANG Yuxin (黄于欣); TAN Kaiwen (谭凯文); ZHANG Yongbing (张勇丙) (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Key Laboratory of Artificial Intelligence of Yunnan Province, Kunming 650500, China; Yunnan International Joint Laboratory of South Asia and Southeast Asia Languages Machine Translation and Application, Kunming University of Science and Technology, Kunming 650500, China; Engineering Research Center of South Asia and Southeast Asia Languages Voice Information Processing, Ministry of Education, Kunming University of Science and Technology, Kunming 650500, China)
Source Journal of Xiamen University: Natural Science (《厦门大学学报(自然科学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2024, No. 4, pp. 714-723 (10 pages)
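Funding National Natural Science Foundation of China (U21B2027, 61972186, 62266027, 62266028); Yunnan Province Major Science and Technology Special Program (202302AD08003, 202202AD080003); Yunnan Province Basic Research Program (202301AT070393, 202301AT070471); Kunming University of Science and Technology "Double First-Class" Joint Construction Project (202201BE070001-021).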
Keywords cross-lingual summarization; Vietnamese-Chinese cross-lingual news summarization; text-image fusion; text-image contrastive loss