Funding: This work was supported in part by the National Natural Science Foundation of China (NSFC) (92167106, 62103362, and 61833014) and the Natural Science Foundation of Zhejiang Province (LR18F030001).
Abstract: Recently developed fault classification methods for industrial processes are mainly data-driven. Notably, models based on deep neural networks have significantly improved fault classification accuracy owing to their ability to learn from a large number of data patterns. However, these data-driven models are vulnerable to adversarial attacks: small perturbations of the samples can cause the models to produce incorrect fault predictions. Several recent studies have demonstrated the vulnerability of machine learning methods and the existence of adversarial samples. This paper proposes a black-box attack method with an extreme constraint for safety-critical industrial fault classification systems: only one variable may be perturbed to craft adversarial samples. Moreover, to hide the adversarial samples in the visualization space, a Jacobian matrix is used to guide the selection of the perturbed variable, making the adversarial samples invisible to the human eye in the dimensionality-reduction space. Using the one-variable attack (OVA) method, we explore the vulnerability of industrial variables and fault types, which helps in understanding the geometric characteristics of fault classification systems. Based on the attack method, a corresponding adversarial-training defense method is also proposed, which efficiently defends against an OVA and improves the prediction accuracy of the classifiers. In experiments, the proposed method was tested on two datasets, from the Tennessee–Eastman process (TEP) and steel plates (SP). We explore the vulnerability of and correlations among variables and faults, and verify the effectiveness of OVAs and defenses for various classifiers and datasets. For industrial fault classification systems, the attack success rate of our method is close to (on TEP) or even higher than (on SP) that of the current most effective first-order white-box attack method, which requires perturbation of all variables.
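To make the Jacobian-guided variable selection concrete, the following is a minimal, hypothetical sketch in PyTorch. The class FaultNet, the function one_variable_attack, the saliency rule, and all dimensions and labels are illustrative assumptions rather than the paper's actual OVA procedure; in the black-box setting described above, the Jacobian would come from a surrogate model rather than the attacked classifier itself.

```python
# Hypothetical sketch: pick the single input variable whose Jacobian entry
# best pushes a sample away from its true fault class toward a target class,
# then perturb only that variable. Not the paper's exact OVA algorithm.
import torch
import torch.nn as nn


class FaultNet(nn.Module):
    """Toy fully connected fault classifier standing in for the attacked model."""

    def __init__(self, n_vars=52, n_faults=21):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_vars, 64), nn.ReLU(),
            nn.Linear(64, n_faults),
        )

    def forward(self, x):
        return self.net(x)


def one_variable_attack(model, x, y_true, target, eps=0.5):
    """Perturb one variable, chosen via the Jacobian of the logits w.r.t. x."""
    # Jacobian of the logits with respect to the input: shape (n_faults, n_vars)
    jac = torch.autograd.functional.jacobian(model, x)
    # Saliency per variable: gain for the target class minus gain for the true class
    score = jac[target] - jac[y_true]
    idx = torch.argmax(score.abs())            # the single variable to perturb
    x_adv = x.clone()
    x_adv[idx] += eps * torch.sign(score[idx])  # one-variable perturbation
    return x_adv, idx.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    model = FaultNet()
    x = torch.randn(52)        # one (scaled) process sample; dimensions are illustrative
    y_true, target = 3, 7      # illustrative fault labels
    x_adv, changed = one_variable_attack(model, x, y_true, target)
    print("perturbed variable index:", changed)
    print("prediction before:", model(x).argmax().item(),
          "after:", model(x_adv).argmax().item())
```

Because only one coordinate of the sample changes, the perturbed point tends to stay close to the original in a low-dimensional projection, which is the intuition behind hiding the adversarial sample in the visualization space; the specific scoring and step-size rules used by the paper are not reproduced here.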