State-of-the-art deep neural networks are vulnerable to adversarial examples with small-magnitude perturbations. In the field of deep-learning-based automated driving, such adversarial attacks expose the weakness of AI models, a limitation that can lead to severe issues regarding the safety of the intended functionality (SOTIF). From the perspective of causality, adversarial attacks can be regarded as confounding effects arising from spurious correlations established by non-causal features. However, few previous works have been devoted to building the relationship between adversarial examples, causality, and SOTIF. This paper proposes a robust physical adversarial perturbation generation method that targets the salient image regions of the targeted attack class under the guidance of class activation mapping (CAM). With CAM, the confounding effects can be maximized through the intermediate variable of the front-door criterion between images and targeted attack labels. In the simulation experiment, the proposed method achieved a 94.6% targeted attack success rate (ASR) on the released dataset when speed-limit-60 km/h (speed-limit-60) signs were attacked so as to be recognized as speed-limit-80 km/h (speed-limit-80) signs. In the real physical experiment, the targeted ASR is 75% and the untargeted ASR is 100%. Beyond these state-of-the-art attack results, detailed experiments evaluate the performance of the proposed method under low resolutions, diverse optimizers, and various defense methods. The code and data are released at the repository: https://github.com/yebin999/rp2-with-cam.
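As an illustration of the CAM-guided idea described above (a minimal sketch, not the authors' released implementation; see the linked repository for that), the following PyTorch code computes a Grad-CAM mask for the target class and then optimizes a targeted perturbation restricted to that mask. The resnet18 backbone, the PGD-style sign-step optimizer, the keep=0.3 mask ratio, and the random stand-in image are all illustrative assumptions.

    # Minimal sketch (not the paper's code): CAM-guided targeted perturbation.
    # Assumptions: a torchvision classifier, Grad-CAM on its last conv block,
    # and a simple PGD-style optimizer confined to high-CAM regions.
    import torch
    import torch.nn.functional as F
    from torchvision.models import resnet18

    def grad_cam_mask(model, layer, x, target_class, keep=0.3):
        """Grad-CAM for target_class, thresholded to a binary spatial mask."""
        feats, grads = [], []
        h1 = layer.register_forward_hook(lambda m, i, o: feats.append(o))
        h2 = layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
        logits = model(x)
        model.zero_grad()
        logits[0, target_class].backward()
        h1.remove(); h2.remove()
        w = grads[0].mean(dim=(2, 3), keepdim=True)           # channel weights
        cam = F.relu((w * feats[0]).sum(dim=1, keepdim=True)) # (1, 1, h, w)
        cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear",
                            align_corners=False)
        cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
        thresh = torch.quantile(cam.flatten(), 1.0 - keep)    # keep top fraction
        return (cam >= thresh).float()                        # binary mask

    def cam_guided_attack(model, x, target_class, mask, eps=8/255,
                          steps=40, lr=1/255):
        """Targeted perturbation optimized only inside the CAM mask."""
        delta = torch.zeros_like(x, requires_grad=True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta * mask),
                                   torch.tensor([target_class]))
            loss.backward()
            with torch.no_grad():
                delta -= lr * delta.grad.sign()  # targeted: descend the loss
                delta.clamp_(-eps, eps)          # keep perturbation small
                delta.grad.zero_()
        return (x + delta.detach() * mask).clamp(0, 1)

    model = resnet18(weights="IMAGENET1K_V1").eval()
    x = torch.rand(1, 3, 224, 224)       # stand-in for a traffic-sign image
    mask = grad_cam_mask(model, model.layer4, x, target_class=919)  # 919: street sign
    x_adv = cam_guided_attack(model, x, target_class=919, mask=mask)

Restricting the perturbation to high-CAM regions concentrates the attack on the image areas the classifier actually relies on for the target class, which is one plausible reading of the confounding-effect maximization the abstract describes; the paper's actual loss, optimizer, and physical-robustness transforms may differ.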
Funding: supported by the National Natural Science Foundation of China under Grant No. 62133011.