
Generative adversarial network based two-stage generation of high-quality images from text

Cited by: 1
Abstract To address the poor image quality and the inconsistency between text descriptions and generated images in traditional text-to-image methods, a generative adversarial network with deep fusion attention (DFA-GAN) is proposed, constrained by multiple loss functions. Image generation proceeds in two stages, with a single-level generative adversarial network (GAN) as the backbone: the initial blurry image produced in the first stage is fed into the second stage, which regenerates it at high quality to improve the final result. In the first stage, a visual-text fusion module deeply fuses text features with image features, so that text information is incorporated into the image sampling process at every scale. In the second stage, an image generator whose encoder is an improved Vision Transformer is proposed to fully fuse image features with the word-level features of the text description. Quantitative and qualitative experiments show that, compared with other mainstream models, the proposed method generates higher-quality images that match the text descriptions more closely.
Authors CAO Yin; QIN Junping; GAO Tong; MA Qianli; REN Jiaqi (College of Data Science and Applications, Inner Mongolia University of Technology, Hohhot 010051, China; Inner Mongolia Autonomous Region Engineering Technology Research Center of Big Data Based Software Service, Hohhot 010000, China; Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China)
Source Journal of Zhejiang University: Engineering Science (indexed in EI, CAS, CSCD, PKU Core), 2024, No. 4, pp. 674-683 (10 pages)
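Funding National Natural Science Foundation of China (61962044); Natural Science Foundation of Inner Mongolia Autonomous Region (2019MS06005); Science and Technology Major Project of Inner Mongolia Autonomous Region (2021ZD0015); Basic Scientific Research Fund for Universities Directly under the Autonomous Region (JY20220327).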
Keywords text-to-image; deep fusion; generative adversarial network (GAN); multi-scale feature fusion; semantic consistency
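
The two-stage data flow described in the abstract can be sketched roughly as follows. This is an illustrative NumPy toy, not the authors' DFA-GAN: the abstract does not specify the fusion operator or the Vision Transformer modifications, so the per-channel affine modulation in stage one and the single cross-attention step in stage two are assumptions, and every function name, shape, and random weight below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def text_conditioned_affine(feat, sent, w_gamma, w_beta):
    """Assumed fusion operator: scale and shift each channel of a
    (C, H, W) feature map using values predicted from a sentence vector (D,)."""
    gamma = sent @ w_gamma  # (C,)
    beta = sent @ w_beta    # (C,)
    return feat * (1.0 + gamma[:, None, None]) + beta[:, None, None]

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def stage_one(noise, sent, params):
    """Stage 1: inject the sentence vector at every scale while upsampling."""
    feat = noise
    for w_gamma, w_beta in params:
        feat = upsample2x(feat)
        feat = text_conditioned_affine(feat, sent, w_gamma, w_beta)
    return feat

def cross_attention(patches, words):
    """Patch tokens (N, D) attend to word features (T, D); residual output."""
    scores = patches @ words.T / np.sqrt(patches.shape[1])  # (N, T)
    scores = scores - scores.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return patches + weights @ words, weights

def stage_two(coarse, words, patch=2):
    """Stage 2: split the coarse image into patch tokens, fuse them with
    word features, then fold the tokens back into a (C, H, W) map."""
    c, h, w = coarse.shape
    hp, wp = h // patch, w // patch
    tokens = (coarse.reshape(c, hp, patch, wp, patch)
                    .transpose(1, 3, 0, 2, 4)
                    .reshape(hp * wp, c * patch * patch))
    # A real model would use learned projections; here the dimensions
    # are simply chosen so that token and word sizes already match.
    assert tokens.shape[1] == words.shape[1]
    fused, weights = cross_attention(tokens, words)
    refined = (fused.reshape(hp, wp, c, patch, patch)
                    .transpose(2, 0, 3, 1, 4)
                    .reshape(c, h, w))
    return refined, weights

# Toy inputs: 4-channel 4x4 noise, 16-d sentence vector, five 16-d word vectors.
noise = rng.standard_normal((4, 4, 4))
sent = rng.standard_normal(16)
words = rng.standard_normal((5, 16))
params = [(0.1 * rng.standard_normal((16, 4)),
           0.1 * rng.standard_normal((16, 4))) for _ in range(2)]

coarse = stage_one(noise, sent, params)      # initial low-quality image features
refined, attn = stage_two(coarse, words)     # text-aware refinement
print(coarse.shape, refined.shape, attn.shape)
```

The split mirrors the abstract: sentence-level information conditions the coarse image at each scale, while word-level features are only brought in during refinement. The discriminator and the loss functions are omitted because the abstract does not detail them.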