Abstract
Deep neural networks are vulnerable to adversarial example attacks. Adversarial examples are generated by adding human-imperceptible perturbations to the original input images, causing deep neural networks to misclassify and posing a security threat. Therefore, before deep neural networks are deployed, adversarial attacks are an important means of evaluating model robustness. However, under the black-box setting, the attack success rate of adversarial examples still needs to be improved; that is, the transferability of adversarial examples needs to be increased. To address this issue, an adversarial example generation method based on image flipping transformation, FT-MI-FGSM (Flipping Transformation Momentum Iterative Fast Gradient Sign Method), was proposed. Firstly, from the perspective of data augmentation, the original input image was randomly flipped in each iteration of the adversarial example generation process. Then, the gradient of the transformed image was calculated. Finally, the adversarial examples were generated according to this gradient, so as to alleviate overfitting in the adversarial example generation process and improve the transferability of the adversarial examples. In addition, attacking an ensemble of models was used to further enhance the transferability of the adversarial examples. Extensive experiments on the ImageNet dataset demonstrated the effectiveness of the proposed method. Compared with I-FGSM (Iterative Fast Gradient Sign Method) and MI-FGSM (Momentum I-FGSM), FT-MI-FGSM improved the average black-box attack success rate on adversarially trained networks by 26.0 and 8.4 percentage points, respectively, under the attacking-ensemble-models setting.
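The iteration described above (random flipping of the input at each step, followed by a momentum gradient-sign update) can be summarised in a short PyTorch-style sketch. The function name `ft_mi_fgsm`, the hyper-parameter defaults, and the restriction to horizontal flips are assumptions made for illustration; this is not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F


def ft_mi_fgsm(model, x, y, eps=16 / 255, num_iter=10, mu=1.0, flip_prob=0.5):
    """Sketch of an FT-MI-FGSM-style attack (illustrative assumptions).

    model: classifier returning logits; x: input batch (N, C, H, W) in [0, 1];
    y: ground-truth labels. eps, num_iter, mu and flip_prob are assumed defaults.
    """
    alpha = eps / num_iter        # per-iteration step size
    g = torch.zeros_like(x)       # accumulated momentum (as in MI-FGSM)
    x_adv = x.clone().detach()

    for _ in range(num_iter):
        x_adv.requires_grad_(True)
        # Flipping transformation: with probability flip_prob, flip the image
        # horizontally before the forward pass (data-augmentation view).
        if torch.rand(1).item() < flip_prob:
            x_in = torch.flip(x_adv, dims=[-1])
        else:
            x_in = x_adv
        loss = F.cross_entropy(model(x_in), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Momentum update on the L1-normalized gradient.
        g = mu * g + grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True)
        # Sign step, then project back into the L-infinity eps-ball and valid range.
        x_adv = x_adv.detach() + alpha * g.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()
```

For the attacking-ensemble-models setting, `model(x_in)` would be replaced by a fused output of several source models (for example, averaged logits), which is a common way to implement ensemble attacks.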
Authors
YANG Bo; ZHANG Hengwei; LI Zheming; XU Kaiyong (Information Engineering University, Zhengzhou, Henan 450001, China; PLA Army General Staff, Beijing 100000, China)
Source
《计算机应用》 (Journal of Computer Applications)
CSCD
Peking University Core Journal
2022, No. 8, pp. 2319-2325 (7 pages)
Funding
National Key Research and Development Program of China (2017YFB0801900).
Keywords
image flipping transform
adversarial example
black-box attack
deep neural network
transferability