Abstract
Deep neural networks (DNNs) are easily affected by adversarial examples and consequently produce wrong outputs. Traditional methods generate adversarial examples from an optimization perspective. In this paper, a method for generating adversarial examples based on a generative adversarial network (GAN) is proposed, and the GAN is exploited for targeted attacks in the white-box setting. A trained generator produces perturbations on the input samples to form adversarial examples. Four kinds of loss functions are used to constrain the quality of the adversarial examples and to improve the attack success rate. Extensive experiments on the MNIST, CIFAR-10 and ImageNet datasets verify the effectiveness of the proposed method, which achieves high attack success rates.
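The abstract describes a generator that is trained to produce perturbations which, added to the input, fool a fixed classifier into a chosen target class. Below is a minimal PyTorch sketch of one such training step in the AdvGAN style. The network names (G, D, f), the weights, the clipping bound c_bound, and the specific choice of four losses (a GAN loss, a targeted classification loss, an L2 perturbation penalty, and a hinge bound on perturbation size) are illustrative assumptions; the paper's own architectures and loss definitions may differ.

```python
# Sketch of one training step for GAN-based targeted white-box attacks.
# G: generator producing a perturbation delta = G(x)
# D: discriminator separating clean from adversarial examples
# f: frozen target classifier (white-box access to its gradients)
import torch
import torch.nn.functional as F

def train_step(G, D, f, x, target, opt_G, opt_D,
               c_bound=0.3, w_adv=10.0, w_pert=1.0, w_hinge=1.0):
    # --- generate adversarial examples ---
    delta = G(x)
    x_adv = torch.clamp(x + delta, 0.0, 1.0)

    # --- update discriminator with the standard GAN loss ---
    opt_D.zero_grad()
    d_real = D(x)
    d_fake = D(x_adv.detach())
    loss_D = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) +
              F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    loss_D.backward()
    opt_D.step()

    # --- update generator with four losses (illustrative choices) ---
    opt_G.zero_grad()
    d_fake = D(x_adv)
    loss_gan = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    loss_adv = F.cross_entropy(f(x_adv), target)                    # targeted attack loss
    loss_pert = delta.flatten(1).norm(2, dim=1).mean()              # L2 perturbation penalty
    loss_hinge = torch.clamp(delta.abs().flatten(1).max(dim=1).values - c_bound,
                             min=0).mean()                          # bound L_inf magnitude
    loss_G = loss_gan + w_adv * loss_adv + w_pert * loss_pert + w_hinge * loss_hinge
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```

At test time only the trained generator is needed: x_adv = clamp(x + G(x), 0, 1), so adversarial examples are produced in a single forward pass rather than by per-sample optimization.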
Authors
ZHANG Gaozhi (张高志), LIU Xinping (刘新平), SHAO Mingwen (邵明文)
College of Computer Science and Technology, China University of Petroleum, Qingdao 266580
Source
Pattern Recognition and Artificial Intelligence (模式识别与人工智能)
Indexed in EI, CSCD, and the Peking University Core Journals list (北大核心)
2020, No. 9, pp. 830-838 (9 pages)
Funding
Supported by the National Natural Science Foundation of China (No. 61673396).
Keywords
Adversarial Example
Generative Adversarial Network (GAN)
Target Attack
White-Box Attack