Text-to-Image Generation with an Attention Mechanism Combined with Semantic Segmentation Maps

Published English title: A Semantic Segmentation Graph in Combination with Attention Mechanism Text Generation Images

Abstract: To address the incomplete structure, unrealistic content, and poor quality of images produced by generative adversarial networks, a text-to-image generation model that combines an attention mechanism with semantic segmentation maps (SSA-GAN) is proposed. First, a simple and effective deep fusion module takes the global sentence vector as its input condition and fuses the text information thoroughly while the image is being generated. Second, semantic segmentation images are incorporated and their edge contour features are extracted, giving the model additional generation and constraint conditions. Third, an attention mechanism supplies fine-grained word-level information to enrich the details of the generated images. Finally, a multimodal similarity model computes a fine-grained image-text matching loss so that the generator is trained more effectively. The model is tested and validated on the CUB-200 and Oxford-102 Flowers datasets. Compared with the images generated by StackGAN, AttnGAN, DF-GAN, and RAT-GAN, the proposed model improves the IS metric by up to 13.7% and 43.2%, respectively, and reduces the FID metric by up to 34.7% and 74.9%, respectively. It also gives better visual results, which demonstrates the effectiveness of the proposed method.
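The abstract describes a fusion module conditioned on the global sentence vector, and the keywords list affine transformation. The following is a minimal sketch of one common way such conditioning is done: the sentence vector predicts a per-channel scale and shift that modulate the image features. It is an illustration under stated assumptions, not the authors' implementation; the function names `linear` and `affine_modulate`, the weight shapes, and the toy numbers are all hypothetical.

```python
def linear(vec, weights, bias):
    """Dense layer: `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def affine_modulate(features, sentence_vec, w_gamma, b_gamma, w_beta, b_beta):
    """Scale and shift each feature channel with text-predicted parameters.

    features: list of C channels, each a flat list of spatial values.
    sentence_vec: global sentence embedding (hypothetical toy vector here).
    """
    gamma = linear(sentence_vec, w_gamma, b_gamma)  # one scale per channel
    beta = linear(sentence_vec, w_beta, b_beta)     # one shift per channel
    return [[g * v + b for v in channel]
            for channel, g, b in zip(features, gamma, beta)]


# Toy example: 2 channels with 2 spatial positions each (assumed values).
features = [[1.0, 2.0], [3.0, 4.0]]
sentence_vec = [1.0, 0.5]
w_gamma, b_gamma = [[1.0, 0.0], [0.0, 2.0]], [0.0, 0.0]
w_beta, b_beta = [[0.0, 0.0], [1.0, 0.0]], [0.5, 0.0]

fused = affine_modulate(features, sentence_vec, w_gamma, b_gamma, w_beta, b_beta)
print(fused)
```

In a trained model the weight matrices would be learned, and several such modulation layers would typically be stacked inside the generator so that the text condition is applied at every resolution.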

Authors: LIANG Chengming; LI Yunhong; LI Limin; SU Xueping; ZHU Mianyun; ZHU Yaolin (School of Electronics and Information, Xi'an Polytechnic University, Xi'an 710048, China)

Affiliation: School of Electronics and Information, Xi'an Polytechnic University

Source: Journal of Air Force Engineering University (indexed in CSCD and the Peking University Core Journals list), 2024, No. 4, pp. 118-127 (10 pages)

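Funding: National Natural Science Foundation of China (62203344); Key Project of the Natural Science Basic Research Program of Shaanxi Province (2022JZ-35); Shaanxi Universities Youth Innovation Team Project.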

Keywords: text-to-image generation; semantic segmentation image; generative adversarial network; attention mechanism; affine transformation

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]
