Image classifiers that based on Deep Neural Networks(DNNs)have been proved to be easily fooled by well-designed perturbations.Previous defense methods have the limitations of requiring expensive computation or reducin...Image classifiers that based on Deep Neural Networks(DNNs)have been proved to be easily fooled by well-designed perturbations.Previous defense methods have the limitations of requiring expensive computation or reducing the accuracy of the image classifiers.In this paper,we propose a novel defense method which based on perceptual hash.Our main goal is to destroy the process of perturbations generation by comparing the similarities of images thus achieve the purpose of defense.To verify our idea,we defended against two main attack methods(a white-box attack and a black-box attack)in different DNN-based image classifiers and show that,after using our defense method,the attack-success-rate for all DNN-based image classifiers decreases significantly.More specifically,for the white-box attack,the attack-success-rate is reduced by an average of 36.3%.For the black-box attack,the average attack-success-rate of targeted attack and non-targeted attack has been reduced by 72.8%and 76.7%respectively.The proposed method is a simple and effective defense method and provides a new way to defend against adversarial samples.展开更多
基金The work is supported by the National Key Research Development Program of China(2016QY01W0200)the National Natural Science Foundation of China NSFC(U1636101,U1736211,U1636219).
文摘Image classifiers that based on Deep Neural Networks(DNNs)have been proved to be easily fooled by well-designed perturbations.Previous defense methods have the limitations of requiring expensive computation or reducing the accuracy of the image classifiers.In this paper,we propose a novel defense method which based on perceptual hash.Our main goal is to destroy the process of perturbations generation by comparing the similarities of images thus achieve the purpose of defense.To verify our idea,we defended against two main attack methods(a white-box attack and a black-box attack)in different DNN-based image classifiers and show that,after using our defense method,the attack-success-rate for all DNN-based image classifiers decreases significantly.More specifically,for the white-box attack,the attack-success-rate is reduced by an average of 36.3%.For the black-box attack,the average attack-success-rate of targeted attack and non-targeted attack has been reduced by 72.8%and 76.7%respectively.The proposed method is a simple and effective defense method and provides a new way to defend against adversarial samples.