Abstract
To prevent the loss of content information that existing one-shot image domain adaptation algorithms suffer during inversion and reconstruction, a content-aligned one-shot image domain adaptation algorithm is proposed in which contrastive language-image pretraining (CLIP) and a vision transformer (ViT) jointly guide the denoising of a diffusion model. First, a diffusion-based domain inversion algorithm is designed: a pre-trained diffusion model inverts an image from the target domain to the source domain, yielding an image pair with the same content but different domain information. Second, the image pair is mapped into the latent space of the CLIP model, where a content-dominant direction and a domain-dominant direction account for content information and domain information, respectively; the image pair is also mapped into the latent space of the ViT model, where contrastive learning constrains content information and domain information separately. Finally, conditionally guided denoising converts arbitrary source-domain images to the target domain. The algorithm also applies to conversion between unseen domains and to multi-attribute editing. Qualitative and quantitative experiments show that it improves on other advanced algorithms by 2% to 27% across multiple performance metrics.
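The abstract does not give the loss formulas, so the following is only a minimal sketch of what a direction-based constraint in an embedding space could look like. It assumes the embeddings are already available as plain lists of floats, and it uses a commonly seen "directional" formulation (one minus the cosine similarity between two embedding shifts), which may differ from the paper's actual content-dominant and domain-dominant terms. All function and parameter names here are hypothetical.

```python
import math


def sub(a, b):
    """Element-wise difference a - b of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("direction vectors must be non-zero")
    return dot / (norm_a * norm_b)


def direction_loss(src_emb, out_emb, ref_src_emb, ref_tgt_emb):
    """Hypothetical domain-direction loss (an assumption, not the paper's formula).

    Compares the shift applied to a new image (src_emb -> out_emb) with the
    shift observed in the reference image pair (ref_src_emb -> ref_tgt_emb).
    Returns 0 when the two shifts point the same way and 2 when opposite.
    """
    return 1.0 - cosine(sub(out_emb, src_emb), sub(ref_tgt_emb, ref_src_emb))
```

In this sketch, the reference pair would play the role of the image pair obtained by domain inversion, and the loss would be small when a generated image moves away from its source in the same direction as that pair.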
Authors
张研博
普园媛
赵征鹏
阳秋霞
徐丹
李思奇
ZHANG Yanbo; PU Yuanyuan; ZHAO Zhengpeng; YANG Qiuxia; XU Dan; LI Siqi (School of Information Science and Engineering, Yunnan University, Kunming 650504, China; Internet of Things Technology and Application Key Laboratory of Universities in Yunnan (Yunnan University), Kunming 650504, China)
Source
《中国科技论文》
CAS
2024, No. 2, pp. 186-192 (7 pages)
China Sciencepaper
Funding
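National Natural Science Foundation of China (62362070)
Key Project of the Applied Basic Research Program, Yunnan Provincial Department of Science and Technology (202001BB050043)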
Keywords
one-shot image domain adaptation
dual-guided diffusion model
content alignment
domain inversion
conditionally guided denoising