Abstract
Semantic image synthesis is an important research and application direction in the field of image-to-image translation. It aims to generate realistic images consistent with an image description from input semantic images such as semantic segmentation maps, maps, and sketches. To address the problem that GAN-based semantic image synthesis produces blurry image features and texture details lacking correlation, owing to the absence of global information, this paper proposes a global-information-enhanced semantic image synthesis method that builds on the pix2pix network and incorporates an external attention mechanism. First, an external attention mechanism is introduced in the upsampling stage of the U-Net generator to strengthen the spatial correlation between pixels of the generated image. Second, deep residual modules are used in the generator's upsampling layers to improve the quality of the generated images while enhancing their diversity. Finally, global information is incorporated into the discriminator to strengthen its discrimination ability. Experiments on the Cityscape, Landscape, and Edges2shoes datasets show that, compared with the baseline model, the improved method achieves improvements of 57.37, 26.74, and 1.78 in the FID (Fréchet Inception Distance) metric, respectively. The results demonstrate that the model can effectively exploit global information to enhance the correlation of texture details in generated images and improve image quality.
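The abstract names an external attention mechanism in the generator's upsampling stage but gives no implementation details. Below is a minimal PyTorch sketch of a standard external attention module (two learnable linear memory units M_k and M_v with double normalization) as it might be inserted after a U-Net upsampling layer; the memory size of 64, the residual connection, and the use of PyTorch are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class ExternalAttention(nn.Module):
    """External attention: attention between input features and two small
    learnable memory units (linear layers) with double normalization.
    Memory size S=64 and the residual connection are assumptions."""
    def __init__(self, channels: int, mem_size: int = 64):
        super().__init__()
        self.mk = nn.Linear(channels, mem_size, bias=False)  # key memory M_k
        self.mv = nn.Linear(mem_size, channels, bias=False)  # value memory M_v

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map from a U-Net upsampling layer
        b, c, h, w = x.shape
        feat = x.flatten(2).transpose(1, 2)                    # (B, N, C), N = H*W
        attn = self.mk(feat)                                   # (B, N, S)
        attn = torch.softmax(attn, dim=1)                      # softmax over the N pixels
        attn = attn / (attn.sum(dim=2, keepdim=True) + 1e-9)   # L1-normalize over S
        out = self.mv(attn)                                    # (B, N, C)
        out = out.transpose(1, 2).reshape(b, c, h, w)
        return out + x                                         # residual connection (assumed)
```

Because the memory units are shared across all pixels of a sample, the module captures long-range (global) correlations at linear cost in the number of pixels, which is consistent with the paper's stated goal of enhancing spatial correlation in the generated images.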
Authors
LIU Yong, LI Junqi, CHEN Yongqiang (School of Computer & Artificial Intelligence, Wuhan Textile University; Engineering Research Center of Hubei Province for Clothing Information, Wuhan 430200, China)
Source
《软件导刊》 (Software Guide), 2024, No. 10, pp. 214-220 (7 pages)
Keywords
image-to-image translation
semantic image synthesis
generative adversarial network
deep learning
computer vision