Funding: Supported by the National Natural Science Foundation of China (No. 61976080), the Academic Degrees & Graduate Education Reform Project of Henan Province (No. 2021SJGLX195Y), the Teaching Reform Research and Practice Project of Henan Undergraduate Universities (No. 2022SYJXLX008), and the Key Project on Research and Practice of Henan University Graduate Education and Teaching Reform (No. YJSJG2023XJ006).
Abstract: Unsupervised multi-modal image translation is an emerging area of computer vision whose goal is to transform an image from the source domain into many diverse styles in the target domain. However, existing state-of-the-art approaches employ a multi-generator mechanism to model the different domain mappings, which makes network training inefficient and causes mode collapse, limiting the diversity of the generated images. To address this issue, this paper introduces a multi-modal unsupervised image translation framework that performs multi-modal translation with a single generator. Specifically, first, a domain code is introduced to explicitly control the different generation tasks. Second, the framework incorporates a squeeze-and-excitation (SE) mechanism and a feature attention (FA) module. Finally, the model integrates multiple optimization objectives to ensure efficient multi-modal translation. Qualitative and quantitative experiments on multiple unpaired benchmark image translation datasets demonstrate the benefits of the proposed method over existing techniques. Overall, the experimental results show that the proposed method is versatile and scalable.
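The abstract gives no implementation details, but the SE mechanism it names is a well-documented building block. Below is a minimal PyTorch sketch of a squeeze-and-excitation layer; the class name and the reduction ratio of 16 are our assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Minimal squeeze-and-excitation block (Hu et al., 2018 style).

    Channel-wise attention: global average pooling ("squeeze") followed by
    a two-layer bottleneck MLP ("excitation") that rescales each channel.
    The reduction ratio of 16 is a common default, not taken from the paper.
    """
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))            # squeeze: (B, C) channel statistics
        w = self.fc(w).view(b, c, 1, 1)   # excitation: per-channel weights
        return x * w                      # rescale the feature maps
```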
Funding: Supported by the National Natural Science Foundation of China (No. 61501457) and the National Key Technology R&D Program (No. 2015BAK21B00).
Abstract: Generative adversarial networks (GANs) have become a competitive method for computer vision tasks, and many studies have applied generative networks to generative tasks such as image synthesis. In this paper, a semi-supervised learning scheme is combined with a generative adversarial network for image classification to improve classification accuracy. Two applications of GANs are the focus: semi-supervised learning and the generation of images that are as realistic as possible. The process has two stages. First, only a small part of the dataset is used as labeled training data. Then a large number of samples produced by the generator are added to the training set to improve the generalization of the discriminator. Through the semi-supervised scheme, full use is made of the unlabeled data, which may contain useful information, and the classification accuracy of the discriminator improves as a result. Experimental results demonstrate the improvement in the discriminator's classification accuracy on several datasets, such as MNIST and CIFAR-10.
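As a rough illustration of this training scheme, the sketch below follows the common (K+1)-class semi-supervised GAN formulation (Salimans et al., 2016), in which the discriminator reserves an extra class for generated samples; the paper may differ in details, and all names here are ours.

```python
import torch
import torch.nn.functional as F

def semi_supervised_d_loss(logits_labeled, labels, logits_unlabeled, logits_fake):
    """Discriminator loss for a (K+1)-class semi-supervised GAN.

    Class index K marks "fake". Labeled data is pushed toward its true
    class, unlabeled data toward "any real class", and generated samples
    toward the fake class. This is the Salimans et al. (2016) formulation,
    assumed here as a plausible instance of the scheme the abstract describes.
    """
    k = logits_labeled.size(1) - 1  # index of the "fake" class

    # Supervised term: standard cross-entropy on the small labeled subset.
    loss_sup = F.cross_entropy(logits_labeled, labels)

    # Unlabeled term: maximize p(real) = sum of the K real-class probabilities.
    log_p = F.log_softmax(logits_unlabeled, dim=1)
    loss_unsup_real = -torch.logsumexp(log_p[:, :k], dim=1).mean()

    # Fake term: generated samples should land in the fake class.
    fake_targets = torch.full((logits_fake.size(0),), k,
                              dtype=torch.long, device=logits_fake.device)
    loss_unsup_fake = F.cross_entropy(logits_fake, fake_targets)

    return loss_sup + loss_unsup_real + loss_unsup_fake
```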
Abstract: Recently, the evolution of Generative Adversarial Networks (GANs) has begun to revolutionize the field of artificial and computational intelligence. To improve the generative ability of GANs, various loss functions have been introduced to measure the similarity between generated samples and real data samples, with differing degrees of effectiveness. In this paper, we present a detailed survey of the loss functions used in GANs and provide a critical analysis of their pros and cons. First, the basic theory of GANs and their training mechanism are introduced. Second, the most commonly used loss functions in GANs are introduced and analyzed. Third, experimental analyses and comparisons of these loss functions are presented across different GAN architectures. Finally, several suggestions are given on choosing suitable loss functions for image synthesis tasks.
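For concreteness, the sketch below writes out the generator-side objectives of three loss families such a survey typically covers (non-saturating vanilla GAN, least-squares GAN, and Wasserstein GAN); these are textbook forms, not the survey's own notation.

```python
import torch
import torch.nn.functional as F

# D's output is a raw logit for the first two losses and an unconstrained
# critic score for WGAN.

def g_loss_vanilla(d_fake_logits):
    # Non-saturating GAN loss: maximize log D(G(z)).
    return F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))

def g_loss_lsgan(d_fake_out):
    # Least-squares GAN: pull D(G(z)) toward the "real" label 1.
    return 0.5 * ((d_fake_out - 1.0) ** 2).mean()

def g_loss_wgan(d_fake_score):
    # Wasserstein GAN: maximize the critic score on generated samples.
    return -d_fake_score.mean()
```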
Abstract: Super-resolution reconstruction in medical imaging has become more demanding because of the need to obtain high-quality images at minimal radiation dose, such as in low-field magnetic resonance imaging (MRI). However, image super-resolution reconstruction remains a difficult task because of image complexity and the high texture fidelity required for diagnosis. In this paper, we offer a deep-learning-based strategy for reconstructing medical images from low resolutions using a Transformer and generative adversarial networks (T-GAN). By inserting a Transformer into the generative adversarial network for image reconstruction, the integrated system can extract more precise texture information and focus on important locations through global image matching. Furthermore, we use a weighted combination of content loss, adversarial loss, and adversarial feature loss as the final multi-task loss function when training the proposed T-GAN. Measured by established metrics such as peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), the proposed T-GAN achieves the best performance and recovers more texture features in the super-resolution reconstruction of MRI scans of the knee and abdomen.
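A minimal sketch of such a weighted multi-task generator loss is given below; the individual loss forms (L1 content loss, BCE adversarial loss, MSE feature matching) and the weights are illustrative assumptions, since the abstract only states that the three terms are weighted and combined.

```python
import torch
import torch.nn.functional as F

def t_gan_generator_loss(sr, hr, d_fake_logits, feat_fake, feat_real,
                         w_content=1.0, w_adv=1e-3, w_feat=1e-2):
    """Sketch of a weighted multi-task generator loss in the spirit of T-GAN.

    Combines a pixel-level content loss, an adversarial loss, and an
    adversarial feature loss computed on discriminator features. The
    weights and exact loss forms are assumptions, not values from the paper.
    """
    content = F.l1_loss(sr, hr)                          # pixel fidelity
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))   # fool the discriminator
    feat = F.mse_loss(feat_fake, feat_real)              # feature-level matching
    return w_content * content + w_adv * adv + w_feat * feat
```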
Abstract: Existing objective image quality assessment methods, when applied to GAN-generated images, often disagree with human subjective evaluation. To address this problem, this paper proposes AJ-GIQA (attention and just noticeable difference based generated image quality assessment), an objective quality assessment method for GAN-generated images that better matches human visual perception. First, to mimic the distortion-sensitivity characteristics of the human visual system, each GAN-generated image is preprocessed into its just-noticeable-difference (JND) map. Then, an attention module is introduced into the feature extraction network to imitate the attention mechanism of the human visual system and capture the salient features of the image. Finally, the image features are fed into a quality prediction network that incorporates semantic information, evaluating the quality of the GAN-generated image comprehensively based on its content. Experiments on GAN-generated image datasets show that AJ-GIQA's predictions are more consistent with subjective mean opinion scores. On image-quality ranking consistency, AJ-GIQA achieves the best accuracy on the LGIQA-LSUN-cat dataset, improving on the SFA method by 0.267. On generalization, AJ-GIQA improves the Pearson linear correlation coefficient on the PIPAL dataset by 0.027 over the state-of-the-art HyperIQA method.
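For reference, the agreement measures the abstract reports can be computed as below; the function names are ours, and this is standard IQA evaluation plumbing rather than any part of AJ-GIQA itself.

```python
import numpy as np
from scipy.stats import spearmanr

def plcc(pred, mos):
    """Pearson linear correlation between predicted scores and mean opinion scores."""
    return np.corrcoef(np.asarray(pred, float), np.asarray(mos, float))[0, 1]

def srcc(pred, mos):
    """Spearman rank correlation, the usual ranking-consistency measure."""
    return spearmanr(pred, mos).correlation
```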
Abstract: Existing image inpainting methods, when repairing large missing regions or irregularly damaged areas, often produce structures inconsistent with the semantics of the original image and blurred texture details. To address this, this paper proposes MSFGAN (multi-scale feature network model based on edge condition), a multi-scale feature-fusion inpainting algorithm that exploits generated edge maps. The model adopts a two-stage design, using the edge map as a repair condition to constrain the structure of the result. First, the Canny operator extracts the edge map of the damaged image and a complete edge map is generated from it; the complete edge map is then combined with the damaged image to perform inpainting. To remedy problems common to inpainting algorithms, a multi-scale feature-fusion module with an attention mechanism (attention mechanism multi-fusion convolution block, AM block) is proposed to carry out feature extraction and fusion on the damaged image. Skip connections are introduced in the decoder of the inpainting network to fuse high-level semantics with low-level features for high-quality detail and texture restoration. Tests on the CelebA and Places2 datasets show that MSFGAN improves inpainting quality over current methods; for mask ratios of 20%–30%, SSIM improves by 0.0291 on average and PSNR by 1.535 dB. Ablation experiments verify the effectiveness of the proposed optimizations and innovations for image inpainting.
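As a minimal sketch of the edge-conditioned setup described above, the snippet below prepares the inputs for the two stages using OpenCV's Canny operator; the thresholds and the masking convention (mask==1 marks holes) are assumptions, not values from the paper.

```python
import cv2
import numpy as np

def prepare_inpainting_inputs(image_bgr: np.ndarray, mask: np.ndarray):
    """Build edge-conditioned inputs of the kind MSFGAN is described to use.

    Returns the damaged image and an incomplete Canny edge map computed on
    the known regions. Stage 1 would complete the edge map; stage 2 inpaints
    the damaged image conditioned on the completed edges.
    """
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)   # thresholds are illustrative defaults
    hole = mask.astype(bool)
    damaged = image_bgr.copy()
    damaged[hole] = 0                   # zero out the missing pixels
    edges[hole] = 0                     # edges are unknown inside the holes
    return damaged, edges
```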