In recent years, various adversarial defense methods have been proposed to improve the robustness of deep neural networks. Adversarial training is one of the most potent methods to defend against adversarial attacks. However, the difference in the feature space between natural and adversarial examples hinders the accuracy and robustness of the model in adversarial training. This paper proposes a learnable distribution adversarial training method, aiming to construct the same distribution for training data utilizing the Gaussian mixture model. The distribution centroid is built to classify samples and constrain the distribution of the sample features. The natural and adversarial examples are pushed to the same distribution centroid to improve the accuracy and robustness of the model. The proposed method generates adversarial examples to close the distribution gap between the natural and adversarial examples through an attack algorithm explicitly designed for adversarial training. This algorithm gradually increases the accuracy and robustness of the model by scaling perturbation. Finally, the proposed method outputs the predicted labels and the distance between the sample and the distribution centroid. The distribution characteristics of the samples can be utilized to detect adversarial cases that can potentially evade the model defense. The effectiveness of the proposed method is demonstrated through comprehensive experiments.
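The centroid-based output described in the abstract (a predicted label plus the distance to the distribution centroid, which can then be used to flag suspicious inputs) can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean distance, the fixed detection threshold, and the names `classify_by_centroid` and `flag_suspicious` are assumptions made here for illustration only, and the centroids are hard-coded rather than learned.

```python
import numpy as np


def classify_by_centroid(features, centroids):
    """Return (labels, nearest_distance) for each feature vector.

    Assumption: Euclidean distance to per-class centroids; the abstract
    does not state which distance the method actually uses.
    """
    # Pairwise differences, shape (n_samples, n_classes, n_dims)
    diffs = features[:, None, :] - centroids[None, :, :]
    # Euclidean distances, shape (n_samples, n_classes)
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Predicted label is the index of the closest centroid
    labels = dists.argmin(axis=1)
    # Distance from each sample to its predicted centroid
    nearest = dists[np.arange(len(features)), labels]
    return labels, nearest


def flag_suspicious(nearest, threshold):
    """Flag samples lying far from every centroid (hypothetical fixed threshold)."""
    return nearest > threshold


# Toy data: two hand-picked class centroids in a 2-D feature space
centroids = np.array([[0.0, 0.0], [4.0, 4.0]])
features = np.array([[0.1, -0.2], [3.9, 4.2], [2.0, 2.1]])

labels, nearest = classify_by_centroid(features, centroids)
flags = flag_suspicious(nearest, threshold=1.0)
```

In this toy run the third sample sits between the two centroids, so it is assigned a label but is also flagged because its distance exceeds the threshold. In practice the threshold would have to be calibrated on held-out data.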
Funding: Supported by the National Natural Science Foundation of China (Nos. U21B2003, 62072250, 62172435, U1804263, U20B2065, 61872203, 71802110, 61802212); the National Key R&D Program of China (No. 2021QY0700); the Key Laboratory of Intelligent Support Technology for Complex Environments (Nanjing University of Information Science and Technology), Ministry of Education; the Natural Science Foundation of Jiangsu Province (No. BK20200750); the Open Foundation of Henan Key Laboratory of Cyberspace Situation Awareness (No. HNTS2022002); the Post Graduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX200974); the Open Project Fund of Shandong Provincial Key Laboratory of Computer Network (No. SDKLCN-2022-05); the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) Fund; and the Graduate Student Scientific Research Innovation Projects of Jiangsu Province (No. KYCX231359).