Abstract
Unsupervised multi-modal image translation is an emerging area of computer vision whose goal is to transform an image from a source domain into many diverse styles in a target domain. However, existing advanced approaches model the different domain mappings with a multi-generator mechanism, which makes training inefficient and causes mode collapse, limiting the diversity of generated images. To address this issue, this paper introduces an unsupervised multi-modal image translation framework that performs multi-modal translation with a single generator. Specifically, a domain code is first introduced to explicitly control the different generation tasks. Second, the squeeze-and-excitation (SE) mechanism and a feature attention (FA) module are incorporated. Finally, the model integrates multiple optimization objectives to ensure efficient multi-modal translation. Qualitative and quantitative experiments on several unpaired benchmark image translation datasets demonstrate the advantages of the proposed method over existing techniques. Overall, the experimental results show that the proposed method is versatile and scalable.
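As context for the mechanisms the abstract names, the sketch below shows a standard squeeze-and-excitation block (Hu et al., CVPR 2018) and a hypothetical way a one-hot domain code could condition a single shared generator. The SE block follows the well-known formulation; the `DomainConditionedBlock` and its channel-wise bias injection are illustrative assumptions, not the paper's confirmed design.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Standard squeeze-and-excitation block: global average pooling
    ("squeeze") followed by a two-layer bottleneck that produces
    per-channel gates ("excitation")."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: B x C x 1 x 1
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),  # per-channel gate in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # reweight feature channels


class DomainConditionedBlock(nn.Module):
    """Hypothetical illustration: project a one-hot domain code to a
    channel-wise bias so one generator can serve every domain mapping.
    The actual injection scheme in the paper may differ."""

    def __init__(self, channels: int, num_domains: int):
        super().__init__()
        self.embed = nn.Linear(num_domains, channels)
        self.se = SEBlock(channels)

    def forward(self, x: torch.Tensor, domain_code: torch.Tensor) -> torch.Tensor:
        bias = self.embed(domain_code).unsqueeze(-1).unsqueeze(-1)
        return self.se(x + bias)


if __name__ == "__main__":
    feats = torch.randn(2, 64, 32, 32)
    code = torch.eye(3)[torch.tensor([0, 2])]  # one-hot codes for 2 samples
    out = DomainConditionedBlock(64, num_domains=3)(feats, code)
    print(out.shape)  # torch.Size([2, 64, 32, 32])
```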
Authors
HU Zhentao; HU Chonghao; YANG Haoran; SHUAI Weiwei (School of Artificial Intelligence, Henan University, Zhengzhou 450046, P.R. China; 95795 Troops of the PLA, Guilin 541003, P.R. China)
Funding
the National Natural Science Foundation of China (No. 61976080)
the Academic Degrees & Graduate Education Reform Project of Henan Province (No. 2021SJGLX195Y)
the Teaching Reform Research and Practice Project of Henan Undergraduate Universities (No. 2022SYJXLX008)
the Key Project on Research and Practice of Henan University Graduate Education and Teaching Reform (No. YJSJG2023XJ006)