CRD-CGAN: category-consistent and relativistic constraints for diverse text-to-image generation

Abstract Generating photo-realistic images from a text description is a challenging problem in computer vision. Previous works have shown promising performance in generating synthetic images conditioned on text with Generative Adversarial Networks (GANs). In this paper, we focus on category-consistent and relativistic diverse constraints to optimize the diversity of synthetic images. Based on those constraints, a category-consistent and relativistic diverse conditional GAN (CRD-CGAN) is proposed to synthesize K photo-realistic images simultaneously. We use the attention loss and diversity loss to improve the sensitivity of the GAN to word attention and noise. Then, we employ the relativistic conditional loss to estimate the probability that synthetic images are relatively real or fake, which can improve the performance of the basic conditional loss. Finally, we introduce a category-consistent loss to alleviate the over-category issues between the K synthetic images. We evaluate our approach on the Caltech-UCSD Birds-200-2011, Oxford 102 flower, and MS COCO 2014 datasets, and extensive experiments demonstrate the superiority of the proposed method over state-of-the-art methods in terms of the photorealism and diversity of the generated synthetic images.

Authors Tao HU, Chengjiang LONG, Chunxia XIAO

Affiliations College of Intelligent Systems Science and Engineering; School of Computer Science; Key Laboratory of Performing Art Equipment & System Technology; Meta Reality Labs

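Source Frontiers of Computer Science (SCIE, EI, CSCD), 2024, Issue 1, pp. 61-75 (15 pages). Chinese journal title: 中国计算机科学前沿（英文版）[Frontiers of Computer Science in China (English Edition)]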

Funding Supported by the National Natural Science Foundation of China (Grant Nos. 61972298 and 61962019), by the National Cultural and Tourism Science and Technology Innovation Project (2021064), and by the Training Program of High Level Scientific Research Achievements of Hubei Minzu University under Grant PY22011.

Keywords text-to-image; diverse conditional GAN; relativistic; category-consistent

Classification number TP391.1 [Automation and Computer Technology: Computer Application Technology]
