
GP-WIRGAN: A Novel Image Recurrent Generative Adversarial Network Model Based on Wasserstein and Gradient Penalty (cited by: 8)
Abstract Most existing image generation models produce an image in a single forward pass of the generator, whereas a painter usually completes a painting through repeated revision, working from coarse to fine in a multi-stage process. Generative models reduce the need for manually labeled image data and can capture the semantics of images, synthesizing approximately realistic data from a learned data distribution. Generative Adversarial Networks (GAN), one of the mainstream generative models, combine game theory and deep learning to train two networks, a generator and a discriminator, against each other. GANs are well known for generating images but are hard to train stably: an ill-suited distance metric in the optimization objective leads to poor diversity in the generated samples. To preserve image quality while improving sample diversity and sample semantics, we imitate the repeated iteration and revision of an artist at work with a scheme we call "multi-generation", and we use the Wasserstein distance to measure the gap between the real and the generated data distributions. The resulting framework is the Wasserstein Image Recurrent Generative Adversarial Networks model (WIRGAN). WIRGAN defines a generative model and a discriminative model. The generative model is a recurrent structure built from a series of neural networks with identical architecture; a time-step parameter T controls the number of recurrent iterations and hence the complexity of the model. The sample generated at time t is added to the output of time t-1, and the image produced at the last time step is the output of the generator. The discriminative model is also a neural network and, combined with weight clipping, decides whether an input image is generated or real. WIRGAN uses the Wasserstein distance as its objective function, which aims to reduce the discrepancy between synthesized and real samples, and trains the two models adversarially. Because weight clipping makes WIRGAN hard to optimize, we further introduce a gradient penalty and propose the Gradient Penalty Optimized Wasserstein Image Recurrent Generative Adversarial Networks model (GP-WIRGAN). Finally, WIRGAN and GP-WIRGAN are evaluated on four datasets (MNIST, CIFAR10, CelebA, and LSUN) in six sets of experiments: basic learning ability, GAM comparison between models, GAM comparison within a model, inception score comparison, visualization of generated images, and time efficiency comparison. Using the Generative Adversarial Metric (GAM) and the inception score as evaluation criteria, the results show that WIRGAN and GP-WIRGAN are stable and can generate high-quality images.
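The mechanisms named in the abstract (additive generation over T time steps, a Wasserstein critic loss, weight clipping, and a gradient penalty) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy tanh critic, the `step_fn` signature, the function names, and the constants `c=0.01` and `lam=10.0` (the defaults commonly used for WGAN weight clipping and WGAN-GP). The penalty term follows the usual WGAN-GP form, evaluated at random interpolates between real and generated samples.

```python
import math
import random

def recurrent_generate(z, step_fn, T):
    """Run T generator steps; each step's output is added to the running canvas."""
    canvas = [0.0] * len(z)
    for _ in range(T):
        update = step_fn(z, canvas)              # hypothetical per-step generator
        canvas = [c + u for c, u in zip(canvas, update)]
    return canvas                                # image from the last time step

def critic(x, W, v):
    """Toy critic (an assumption): D(x) = sum_j v[j] * tanh(W[j] . x)."""
    return sum(vj * math.tanh(sum(w * xi for w, xi in zip(Wj, x)))
               for Wj, vj in zip(W, v))

def critic_input_grad(x, W, v):
    """Analytic gradient of the toy critic with respect to its input x."""
    grad = [0.0] * len(x)
    for Wj, vj in zip(W, v):
        h = math.tanh(sum(w * xi for w, xi in zip(Wj, x)))
        scale = vj * (1.0 - h * h)
        for k, w in enumerate(Wj):
            grad[k] += scale * w
    return grad

def critic_loss(real, fake, W, v):
    """Wasserstein critic loss: mean D(fake) - mean D(real)."""
    mean = lambda batch: sum(critic(x, W, v) for x in batch) / len(batch)
    return mean(fake) - mean(real)

def clip_weights(W, c=0.01):
    """Weight clipping: force every critic weight into [-c, c]."""
    return [[max(-c, min(c, w)) for w in row] for row in W]

def gradient_penalty(real, fake, W, v, rng, lam=10.0):
    """lam * mean((||grad_x D(x_hat)||_2 - 1)^2) at random interpolates x_hat."""
    total = 0.0
    for xr, xf in zip(real, fake):
        eps = rng.random()
        x_hat = [eps * a + (1.0 - eps) * b for a, b in zip(xr, xf)]
        norm = math.sqrt(sum(g * g for g in critic_input_grad(x_hat, W, v)))
        total += (norm - 1.0) ** 2
    return lam * total / len(real)

# Tiny demo on random vectors standing in for flattened images.
rng = random.Random(0)
d, hidden, n, T = 6, 4, 5, 3
W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(hidden)]
v = [rng.gauss(0, 1) for _ in range(hidden)]
step = lambda z, canvas: [0.5 * math.tanh(zi - ci) for zi, ci in zip(z, canvas)]
fake = [recurrent_generate([rng.gauss(0, 1) for _ in range(d)], step, T)
        for _ in range(n)]
real = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
gp_loss = critic_loss(real, fake, W, v) + gradient_penalty(real, fake, W, v, rng)
```

In this sketch the clipped variant would call `clip_weights` after each critic update, while the penalized variant would add `gradient_penalty` to the critic loss instead of clipping. A real implementation would use convolutional networks and automatic differentiation rather than the hand-written gradient shown here.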
Authors FENG Yong (冯永); ZHANG Chun-Ping (张春平); QIANG Bao-Hua (强保华); ZHANG Yi-Yang (张逸扬); SHANG Jia-Xing (尚家兴) (College of Computer Science, Chongqing University, Chongqing 400030; Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education, Chongqing University, Chongqing 400030; Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin, Guangxi 541004; Guangxi Key Laboratory of Optoelectronic Information Processing, Guilin University of Electronic Technology, Guilin, Guangxi 541004)
Source Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list, 2020, No. 2, pp. 190-205 (16 pages)
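Funding Supported by the National Natural Science Foundation of China (61762025); the National Key R&D Program of China (2017YFB1402400); the Chongqing Basic and Frontier Research Program (cstc2017jcyjAX0340); the Open Project of the Guangxi Key Laboratory of Trusted Software (kx201701); the Guangxi Key Laboratory of Optoelectronic Information Processing (Cultivation Base) Fund (GD18202); the Chongqing Special Project for Common Key Technology Innovation in Key Industries (cstc2017zdcy-zdyxx0047); and the Chongqing Special Project for Science and Technology Innovation in Social Undertakings and People's Livelihood (cstc2017shmsA20013).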
Keywords image generation; generative adversarial networks; Wasserstein distance; deep learning; weight clipping; gradient penalty