Abstract
Image translation is an important research direction in computer vision, with broad applications in visual tasks such as image stylization and super-resolution image generation. To address the high cost of semantic annotation and the difficulty of labeling datasets in image translation, this paper proposes a few-shot semantic image translation algorithm based on prototype correction, which mainly comprises a StyleGAN module, a semantic similarity regressor, and a pSp encoder. First, to reduce the model's dependence on labeled images, the algorithm uses a pre-trained StyleGAN model as the generator, which increases the number of training samples available in the few-shot setting and improves the diversity of the generated images. Second, to account for intra-class variation in sample semantics, the algorithm designs a semantic similarity regressor that corrects the prototypes, improving pseudo-label accuracy and strengthening model optimization. Third, the feature maps of the labeled and synthesized images are combined with the prototype vectors to realize cyclic synthesis of semantic information, and a self-supervised loss function is constructed so that training the semantic similarity regressor does not require label information; the pseudo-labeled images are then used to further train the pSp encoder, accomplishing the semantic image translation task. Finally, experimental results show that the proposed algorithm outperforms classical algorithms in both generalization performance and the diversity of synthesized images.
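The abstract does not say how prototypes or pseudo-labels are computed, so the following is only a minimal sketch of generic prototype-based pseudo-labeling, not the authors' implementation. It assumes masked average pooling for the prototypes and cosine similarity for pixel assignment, both common choices in few-shot segmentation. The function names `class_prototypes` and `pseudo_labels` are hypothetical, and the correction step performed by the semantic similarity regressor is not reproduced here.

```python
import numpy as np


def class_prototypes(features, labels, num_classes):
    """Masked average pooling (assumed): mean feature vector per class.

    features: float array of shape (H, W, C)
    labels:   int array of shape (H, W) with values in [0, num_classes)
    Returns an array of shape (num_classes, C); absent classes stay zero.
    """
    channels = features.shape[-1]
    protos = np.zeros((num_classes, channels), dtype=float)
    for k in range(num_classes):
        mask = labels == k
        if mask.any():
            # features[mask] has shape (num_pixels_in_class, C)
            protos[k] = features[mask].mean(axis=0)
    return protos


def pseudo_labels(features, protos, eps=1e-8):
    """Assign each pixel to the prototype with the highest cosine similarity."""
    f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + eps)
    p = protos / (np.linalg.norm(protos, axis=-1, keepdims=True) + eps)
    similarity = f @ p.T  # shape (H, W, num_classes)
    return similarity.argmax(axis=-1)
```

In this sketch, prototypes built from one labeled feature map would be reused to label the feature maps of unlabeled synthesized images. Intra-class variation makes such raw prototypes imprecise, which is the problem the abstract says prototype correction is meant to address.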
Authors
HE Zhilin
GU Tianhao
XU Guanhua
(School of Automation, Qingdao University, Qingdao, Shandong 260000, China; Institute of Intelligent Unmanned System, Qingdao University, Qingdao, Shandong 260000, China)
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2024, Issue 8, pp. 224-231 (8 pages)
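Funding
National Natural Science Foundation of China (62076094, 61773227)
China Postdoctoral Science Foundation (2022M721744)
Shandong Postdoctoral Innovative Talent Support Program (SDBX2022023)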
Keywords
Image translation
Prototype correction
Few-shot learning
Generative adversarial network