Abstract
Studies have shown that deep neural networks (DNNs) are vulnerable to adversarial examples (AEs) that induce incorrect behaviors. Various detection techniques have been developed to defend against these AEs. However, most of them are effective only against specific AEs and do not generalize well to different AEs. We propose a new detection method for AEs based on the maximum channel of saliency maps (MCSM). The proposed method can alter the structure of adversarial perturbations while preserving the statistical properties of images. We conduct a comprehensive evaluation on AEs generated by six prominent adversarial attacks on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012 validation set. The experimental results show that our method performs well in detecting various AEs.
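The abstract names the maximum channel of saliency maps as the core representation. The sketch below is only an illustration of that generic operation, not the authors' pipeline: it computes an input-gradient saliency map for an ImageNet classifier and reduces it to its per-pixel maximum over color channels. The choice of ResNet-50, the preprocessing, and how the resulting map would feed a detector are all assumptions.

```python
# Illustrative sketch only: input-gradient saliency map reduced to its
# maximum channel. The model, preprocessing, and downstream use are
# assumptions, not the paper's exact method.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

def max_channel_saliency(model, image_tensor):
    """Return the per-pixel maximum over channel-wise |d score / d input|."""
    x = image_tensor.unsqueeze(0).requires_grad_(True)  # shape (1, 3, H, W)
    logits = model(x)
    score = logits[0, logits[0].argmax()]               # top-class score
    score.backward()
    saliency = x.grad.detach().abs().squeeze(0)         # (3, H, W) gradients
    return saliency.max(dim=0).values                   # (H, W) maximum channel

model = models.resnet50(weights="IMAGENET1K_V1").eval()
preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
img = preprocess(Image.open("example.jpg").convert("RGB"))  # hypothetical file
mcsm = max_channel_saliency(model, img)                     # (224, 224) map
```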
Funding
Supported by the Joint Funds of the National Natural Science Foundation of China (No. U1536122),
the Science and Technology Commission Major Special Projects of Tianjin of China (No. 15ZXDSGX00030),
and the Tianjin Municipal Commission of Education of China (No. 2021YJSB252).