Abstract
Adversarial attacks add carefully designed perturbations to the input samples of neural network models, causing the models to output wrong results with high confidence. Research on adversarial attacks has mainly targeted single-model application scenarios; attacks on multiple models are mostly realized through cross-model transfer attacks, and universal cross-model attack methods remain little studied. By analyzing the geometric relationships among multi-model attack perturbations, the orthogonality of adversarial directions across different models and the orthogonality between a single model's adversarial direction and its decision boundary were clarified, and a universal cross-model attack algorithm with a corresponding optimization strategy was designed accordingly. The proposed algorithm was verified through multi-angle cross-model adversarial attacks on the CIFAR10 and SVHN datasets and six common neural network models. Experimental results show that, in the given experimental scenario, the algorithm achieves an attack success rate of 1.0 with a perturbation L2 norm of no more than 0.9; compared with cross-model transfer attacks, the proposed algorithm improves the average attack success rate on the six models by up to 57% and has better universality.
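The abstract reports results only, not the algorithm itself. As a rough illustration of the general idea it describes (accumulating several models' adversarial gradient directions into one shared perturbation under an L2 budget), here is a minimal sketch. It is not the paper's published algorithm: the step count, step size, and the 0.9 L2 budget (borrowed from the abstract's reported bound) are assumptions.

```python
# Hypothetical sketch of a cross-model attack in the spirit of the abstract:
# per-model loss gradients (treated as near-orthogonal adversarial directions)
# are summed into one joint perturbation, kept inside an L2 ball.
# NOT the paper's algorithm; hyperparameters are illustrative only.
import torch
import torch.nn.functional as F

def cross_model_perturbation(models, x, y, steps=10, step_size=0.1, l2_budget=0.9):
    """Push x across the decision boundaries of all models simultaneously.

    models: list of classifiers, assumed to be in eval() mode.
    x, y:   a batch of inputs and their true labels.
    """
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        # Summing the classification losses makes the gradient the sum of
        # each model's own adversarial direction.
        loss = sum(F.cross_entropy(m(x + delta), y) for m in models)
        grad, = torch.autograd.grad(loss, delta)
        # Gradient-ascent step along the combined direction, normalized to
        # unit L2 norm (updating .data keeps delta a leaf tensor).
        delta.data += step_size * grad / (grad.norm() + 1e-12)
        # Project back onto the L2 ball of radius l2_budget.
        norm = delta.data.norm()
        if norm > l2_budget:
            delta.data *= l2_budget / norm
    return delta.detach()

# Usage: x_adv = x + cross_model_perturbation([model_a, model_b], x, y)
```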
Authors
ZHANG Jici, FAN Chunlong, LI Cailong, ZHENG Xuedong (School of Computer Science, Shenyang Aerospace University, Shenyang, Liaoning 110136, China)
Source
《计算机应用》 (Journal of Computer Applications)
Indexed in CSCD and the Peking University Core Journals list
2023, No. 11, pp. 3428-3435 (8 pages)
Funding
National Natural Science Foundation of China (61972266).
Keywords
deep learning
adversarial sample generation
adversarial attack
cross-model attack
classifier