Abstract
Adversarial examples can cause deep neural networks to produce erroneous outputs with high confidence. This study aims to improve the safety of deep neural networks by distinguishing adversarial examples. A classification model based on a filter residual network structure is used to classify adversarial examples accurately. The filter-based classification model consists of a residual network feature extraction module and a classification module, which are iteratively optimized with an adversarial training strategy. Three mainstream adversarial attack methods are improved and used to generate adversarial examples on the Mini-ImageNet dataset. These examples are then used to attack both EfficientNet and the filter-based classification model, and the attack effects are compared. Experimental results show that the filter-based classification model achieves high classification accuracy on Mini-ImageNet adversarial examples, and that adversarial training can effectively enhance the robustness of deep neural network models.