Abstract
In research on attacks against and defenses for image classification models, most backdoor attacks assume a white-box scenario in which the attacker must control both the training data and the training process. This requirement makes backdoor attacks difficult to carry out in real-world settings. To improve their feasibility, this paper adopts a "gray-box" attack scenario in which the attacker only needs to control the training data and does not need to take part in the training process. Existing backdoor attacks typically add a patch to clean samples as the backdoor trigger, which is easily discovered by manual inspection or detected by defense models. Using an adversarial attack instead can reduce the abnormal distribution of the perturbation and thereby improve the concealment of the poisoned samples. Based on this idea, an adversarial perturbation algorithm that satisfies a Gaussian distribution is proposed to generate backdoor triggers. Unlike existing adversarial perturbations, the algorithm adds Gaussian-distributed noise once in each round of the adversarial iteration, so that the trigger obtained after the final iteration is more stable and covert and is better at evading defense detection. Experimental results show that, on average, fewer than 10% of poisoned samples are flagged as anomalous by defense detection, a detection rate about 13% lower than that of traditional methods.
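The abstract gives only the outline of the trigger generation step: an iterative adversarial perturbation with Gaussian noise added once per round. The following is a minimal sketch of that outline, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the toy linear softmax classifier (`W`, `b`), the signed-gradient update, the L-infinity bound `eps`, the step size `step`, the noise scale `sigma`, the number of `rounds`, and the function name `gaussian_adversarial_trigger`.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def gaussian_adversarial_trigger(x, target, W, b, eps=0.1, step=0.02,
                                 sigma=0.01, rounds=20, seed=0):
    """Hypothetical sketch: iterative targeted perturbation on a linear
    softmax model, with Gaussian noise added once in every round."""
    rng = np.random.default_rng(seed)
    delta = np.zeros_like(x)
    onehot = np.eye(W.shape[0])[target]
    for _ in range(rounds):
        p = softmax(W @ (x + delta) + b)
        # Gradient of the cross-entropy loss for the target class w.r.t. the input.
        grad = W.T @ (p - onehot)
        # Signed-gradient step that lowers the target-class loss (assumed update rule).
        delta = delta - step * np.sign(grad)
        # One draw of zero-mean Gaussian noise per round (sigma is an assumption).
        delta = delta + rng.normal(0.0, sigma, size=x.shape)
        # Keep the perturbation small and the perturbed image in [0, 1].
        delta = np.clip(delta, -eps, eps)
        delta = np.clip(x + delta, 0.0, 1.0) - x
    return delta

# Toy usage with random stand-ins for a model and a flattened 8x8 image.
demo_rng = np.random.default_rng(42)
W = demo_rng.normal(size=(3, 64))
b = np.zeros(3)
x = demo_rng.uniform(0.0, 1.0, size=64)
delta = gaussian_adversarial_trigger(x, target=1, W=W, b=b)
poisoned = x + delta
```

In a gray-box setting such as the one described, a perturbation like `delta` would be added to a subset of training samples before they are released; how the paper selects those samples and labels is not stated in the abstract.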
Authors
袁国桃
黄洪
李心
杜瑞
王兆莲
YUAN Guotao; HUANG Hong; LI Xin; DU Rui; WANG Zhaolian (School of Computer Science and Engineering, Sichuan University of Science & Engineering, Yibin 644000, China; Sichuan Province University Key Laboratory of Bridge Non-destruction Detecting and Engineering Computing, Yibin 644000, China)
Source
《四川轻化工大学学报(自然科学版)》
CAS
2023, No. 4, pp. 52-60 (9 pages)
Journal of Sichuan University of Science & Engineering(Natural Science Edition)
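Funding
Sichuan Science and Technology Program (2020YFG0151)
Talent Introduction Project of Sichuan University of Science & Engineering (2021RC15)
Graduate Innovation Fund of Sichuan University of Science & Engineering (Y2022185)
Open Fund of the Sichuan Province University Key Laboratory of Bridge Non-destruction Detecting and Engineering Computing (2022QYJ06)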
Keywords
image classification model
backdoor attack
Gaussian distribution
adversarial perturbation