Funding: Project supported by the National Natural Science Foundation of China (Nos. 61802085 and 61563012), the Guangxi Provincial Natural Science Foundation, China (Nos. 2021GXNSFAA220074 and 2020GXNSFAA159038), the Guangxi Key Laboratory of Embedded Technology and Intelligent System Foundation, China (No. 2018A-04), and the Guangxi Key Laboratory of Trusted Software Foundation, China (No. kx202011).
Abstract: The skewed distribution of multiclass data poses a major challenge to machine learning algorithms, since traditional methods are sensitive to skewed distributions and do not account for the characteristics of multiclass imbalance problems. To tackle such issues, we propose a new splitting criterion for the decision tree based on the one-against-all-based Hellinger distance (OAHD). Two crucial elements are included in OAHD. First, the one-against-all scheme is integrated into the computation of the Hellinger distance, thereby extending the Hellinger distance decision tree to cope with the multiclass imbalance problem. Second, the distribution and the number of distinct classes are taken into account, and a modified Gini index is designed for the multiclass imbalance problem. Moreover, we give theoretical proofs of the properties of OAHD, including its skew insensitivity and its ability to seek purer nodes in the decision tree. Finally, we collect 20 public real-world imbalanced data sets from the Knowledge Extraction based on Evolutionary Learning (KEEL) repository and the University of California, Irvine (UCI) repository. Experimental and statistical results show that OAHD significantly improves performance compared with five other well-known decision trees in terms of precision, F-measure, and multiclass area under the receiver operating characteristic curve (MAUC). Moreover, the Friedman and Nemenyi tests confirm the advantage of OAHD over the five other decision trees.
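Because the abstract only outlines the criterion, the following is a minimal sketch of how a one-against-all Hellinger split score might be computed. The function name, the restriction to a binary split, and the aggregation of per-class distances by averaging are illustrative assumptions; the paper's exact formulation, including its modified Gini index, is not reproduced here.

```python
import numpy as np

def hellinger_split_score(y, left_mask):
    """Illustrative one-against-all Hellinger score for a candidate binary split.

    y         : 1-D array of class labels at the current node.
    left_mask : boolean array, True for samples routed to the left child.

    For each class c, samples are viewed as "c" versus "rest", and the Hellinger
    distance between the branch distributions conditioned on each side is
    computed; the per-class distances are averaged (an assumed aggregation).
    Using P(branch | class) rather than P(class | branch) is what makes the
    Hellinger criterion insensitive to class skew.
    """
    right_mask = ~left_mask
    scores = []
    for c in np.unique(y):
        pos, neg = (y == c), (y != c)
        n_pos, n_neg = pos.sum(), neg.sum()
        if n_pos == 0 or n_neg == 0:  # degenerate one-vs-rest case
            continue
        p_left_pos = (pos & left_mask).sum() / n_pos
        p_left_neg = (neg & left_mask).sum() / n_neg
        p_right_pos = (pos & right_mask).sum() / n_pos
        p_right_neg = (neg & right_mask).sum() / n_neg
        d = np.sqrt((np.sqrt(p_left_pos) - np.sqrt(p_left_neg)) ** 2
                    + (np.sqrt(p_right_pos) - np.sqrt(p_right_neg)) ** 2)
        scores.append(d)
    return float(np.mean(scores)) if scores else 0.0

# Toy usage: prefer the candidate split with the larger score.
y = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2])
split_a = np.array([True] * 6 + [False] * 6)
split_b = np.array([True, False] * 6)
print(hellinger_split_score(y, split_a), hellinger_split_score(y, split_b))
```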
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 61074127).
Abstract: Analog circuit fault diagnosis is essential for guaranteeing the reliability and maintainability of electronic systems. In this paper, a novel analog circuit fault diagnosis approach is proposed based on greedy kernel principal component analysis (GKPCA) and a one-against-all support vector machine (OAASVM). To obtain a successful SVM-based fault classifier, eliminating noise and extracting fault features are essential. Kernel principal component analysis (KPCA) is adopted in the proposed approach because it outperforms PCA at nonlinear fault feature extraction and noise elimination. However, KPCA has a drawback: the storage required for the kernel matrix grows quadratically with the number of training samples, and the computational cost of its eigendecomposition grows even faster. Therefore, GKPCA, which approximates KPCA with a small representation error, is introduced to enhance computational efficiency. Based on statistical learning theory and the empirical risk minimization principle, SVM offers good classification accuracy and generalization performance. The extracted fault features are then used as the inputs of the OAASVM to solve the fault diagnosis problem. The effectiveness of the proposed approach is verified by the experimental results.
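As a rough illustration of the KPCA-plus-one-against-all-SVM pipeline described above, the sketch below uses scikit-learn's KernelPCA and OneVsRestClassifier. It does not implement the greedy approximation (GKPCA) itself, and the synthetic data, kernel choices, and parameter values are placeholders rather than the paper's experimental setup.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data standing in for circuit response measurements under
# several fault conditions (labels 0..3); not the paper's data set.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
y = rng.integers(0, 4, size=200)

# KPCA for nonlinear feature extraction / noise reduction, followed by a
# one-against-all SVM (OAASVM). Exact KPCA is used here for simplicity;
# the paper's greedy variant (GKPCA) would replace this step to keep the
# effective kernel matrix small.
model = make_pipeline(
    StandardScaler(),
    KernelPCA(n_components=8, kernel="rbf", gamma=0.1),
    OneVsRestClassifier(SVC(kernel="rbf", C=1.0)),
)
model.fit(X, y)
print(model.predict(X[:5]))
```

Wrapping SVC in OneVsRestClassifier makes the one-against-all decomposition explicit, matching the OAASVM formulation; scikit-learn's SVC alone would otherwise default to a one-versus-one multiclass strategy.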