Reweighting adversarial examples during training plays an essential role in improving the robustness of neural networks: examples closer to the decision boundaries are much more vulnerable to attack and should be given larger weights. The probability margin (PM) method is a promising approach to continuously and path-independently measuring such closeness between an example and the decision boundary. However, the performance of PM is limited because PM fails to effectively distinguish examples with only one misclassified category from those with multiple misclassified categories, where the latter are closer to the multi-classification decision boundaries and, in our observation, more critical. To tackle this problem, this paper proposes an improved PM criterion, called confused-label-based PM (CL-PM), to measure this closeness and reweight adversarial examples during training. Specifically, a confused label (CL) is defined as a label whose prediction probability is greater than that of the ground-truth label for a given adversarial example. Instead of considering only the discrepancy between the probability of the true label and the probability of the most misclassified label, as the PM method does, we evaluate the closeness by accumulating the probability differences between the ground-truth label and all the CLs. CL-PM is negatively correlated with data vulnerability: data with a larger/smaller CL-PM are safer/riskier and should receive a smaller/larger weight. Experiments demonstrated that CL-PM is more reliable in indicating the closeness in the presence of multiple misclassified categories, and that reweighting adversarial training based on CL-PM outperformed state-of-the-art counterparts.
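The two criteria described above can be sketched as follows. This is a minimal reconstruction from the abstract's definitions only, not the authors' implementation: the function names and the plain-list interface are our own, and the exact normalization used in the paper may differ.

```python
def probability_margin(probs, y):
    """PM: probability of the true label y minus the largest probability
    among all other labels (as described in the abstract)."""
    others = [p for i, p in enumerate(probs) if i != y]
    return probs[y] - max(others)

def cl_pm(probs, y):
    """CL-PM: accumulate the probability differences between the
    ground-truth label y and every confused label (CL), i.e. every
    label whose probability exceeds that of y."""
    return sum(probs[y] - p for p in probs if p > probs[y])
```

Under this reading, an example with several confused labels gets a more negative CL-PM than one with a single confused label of the same margin, which is exactly the distinction PM alone cannot make; smaller CL-PM then maps to a larger training weight.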
Funding: supported by the National Natural Science Foundation of China (Nos. 62072127, 62002076, 61906049), the Natural Science Foundation of Guangdong Province (Nos. 2023A1515011774, 2020A1515010423), Project 6142111180404 supported by CNKLSTISS, the Science and Technology Program of Guangzhou, China (No. 202002030131), the Guangdong Basic and Applied Basic Research Foundation Youth Fund (No. 2019A1515110213), the Open Fund Project of Fujian Provincial Key Laboratory of Information Processing and Intelligent Control (Minjiang University) (No. MJUKF-IPIC202101), and the Scientific Research Project of Guangzhou University (No. RP2022003).