Based on wavelet packet transformation (WPT), genetic algorithm (GA), back propagation neural network (BPNN) and support vector machine (SVM), a fault diagnosis method for diesel engine valve clearance is presented. With power spectral density analysis, the characteristic frequencies related to the engine running conditions can be extracted from vibration signals. The biggest singular values (BSV) of wavelet coefficients and the root mean square (RMS) values of vibration in the characteristic frequency sub-bands are extracted at the end of the third-level decomposition of the vibration signals, and they are used as input vectors of the BPNN or SVM. To avoid being trapped in local minima, GA is adopted. The normal and fault vibration signals measured under different valve clearance conditions are analyzed. BPNN, GA back propagation neural network (GA-BPNN), SVM and GA-SVM are trained and tested on the different extracted features, and the classification accuracies and training times are compared to determine the optimum fault classifier and feature selection. Experimental results demonstrate that the proposed features and classification algorithms give a classification accuracy of 100%.
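The sub-band RMS features described above can be illustrated with a minimal sketch. It assumes a Haar wavelet and a synthetic test signal, neither of which is stated in the abstract, and it omits the singular-value features and the classifiers; the function names are hypothetical.

```python
import math

def haar_step(signal):
    """Split a signal into Haar approximation and detail coefficients."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def wavelet_packet(signal, level=3):
    """Return the 2**level terminal nodes of a Haar wavelet packet tree.

    Both the approximation and the detail branch are split at every level.
    Nodes come back in natural order, which is not sorted by frequency.
    """
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes

def rms(values):
    """Root mean square of a list of coefficients."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def subband_rms_features(signal, level=3):
    """One RMS value per terminal sub-band; usable as a classifier input vector."""
    return [rms(node) for node in wavelet_packet(signal, level)]

# Synthetic stand-in for a measured vibration signal (assumption).
signal = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
features = subband_rms_features(signal)
```

A third-level decomposition yields eight sub-bands, so `features` has eight entries. In practice the signal length should be a multiple of 2**level so that no samples are dropped.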
The growing number of uncertain factors in power systems and their increasingly complex operation modes place higher requirements on online transient stability assessment methods. The traditional model-driven methods have clear physical mechanisms and reliable evaluation results, but the calculation process is time-consuming, while the data-driven methods have strong fitting ability and fast calculation speed, but their evaluation results lack interpretability. Therefore, combining these two kinds of methods is a future development trend of transient stability assessment. In this paper, the rate of change of the kinetic energy method is used to calculate the transient stability in the model-driven stage, and the support vector machine and the extreme learning machine, which have different internal principles, are each used to predict the transient stability in the data-driven stage. To quantify the credibility level of the data-driven methods, a credibility index of the output results is proposed. Then a switching function that controls whether the rate of change of the kinetic energy method is activated is established based on this index. Thus, a new parallel integrated model-driven and data-driven online transient stability assessment method is proposed. The accuracy, efficiency, and adaptability of the proposed method are verified by numerical examples.
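The switching idea can be sketched as follows. The abstract does not define the credibility index, so the index used here (agreement of the two predictors plus the smaller score magnitude), the threshold of 0.5, and all function names are illustrative assumptions rather than the paper's formulation.

```python
def credibility(svm_score, elm_score):
    """Hypothetical credibility index in [0, 1].

    Both scores are assumed to lie in [-1, 1], positive meaning "stable".
    Disagreement between the two predictors gives zero credibility.
    """
    if svm_score * elm_score <= 0:
        return 0.0
    return min(abs(svm_score), abs(elm_score))

def assess(svm_score, elm_score, model_driven, threshold=0.5):
    """Return (verdict, source).

    The model-driven calculation is activated only when the data-driven
    result is not credible enough, which mirrors the switching function.
    """
    if credibility(svm_score, elm_score) >= threshold:
        verdict = "stable" if svm_score > 0 else "unstable"
        return verdict, "data-driven"
    return model_driven(), "model-driven"

# The callable stands in for the slower kinetic-energy-based calculation.
result = assess(0.9, -0.2, model_driven=lambda: "unstable")
```

Passing the model-driven method as a callable means the expensive calculation runs only when the switch activates it.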
Some dimensionality reduction (DR) approaches based on the support vector machine (SVM) have been proposed. However, when deriving the projection matrix, these approaches consider only the between-class margin from SVM and ignore the within-class information in the data. This paper presents a new DR approach, called dimensionality reduction based on SVM and LDA (DRSL). DRSL considers the between-class margins from SVM and LDA, and the within-class compactness from LDA, to obtain the projection matrix. As a result, DRSL combines the between-class and within-class information and fits the between-class and within-class structures in the data. Hence, the obtained projection matrix increases the generalization ability of subsequent classification techniques. Experiments with classification techniques show the effectiveness of the proposed method.
Online fault detection is one of the key technologies for improving the performance of cloud systems. The current data of cloud systems are monitored, collected and used to reflect their state. Their use can potentially help cloud managers take timely measures before faults occur in clouds. Because of the complex structure and dynamically changing characteristics of clouds, existing fault detection methods suffer from low efficiency and low accuracy. To solve these problems, this work proposes an online detection model based on a systematic parameter-search method called SVM-Grid, whose construction is based on a support vector machine (SVM). SVM-Grid is used to optimize the parameters of the SVM. Proper attributes of a cloud system's running data are selected for the model by using Pearson correlation and principal component analysis. Strategies for predicting cloud faults and updating fault sample databases are proposed to optimize the model and improve its performance. In comparison with some representative existing methods, the proposed model can achieve more efficient and accurate fault detection for cloud systems.
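The Pearson-correlation part of the attribute selection can be sketched as below. The metric names, the sample values, and the 0.5 cut-off are made up for illustration; the abstract does not say how the correlation is thresholded, and the principal component analysis and SVM-Grid steps are not shown.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def select_attributes(columns, labels, min_abs_r=0.5):
    """Keep attributes whose |r| with the fault label reaches the cut-off."""
    return [name for name, values in columns.items()
            if abs(pearson(values, labels)) >= min_abs_r]

# Hypothetical monitoring data: one list per attribute, one label per sample.
metrics = {
    "cpu_load": [0.2, 0.3, 0.9, 0.95, 0.25, 0.85],
    "disk_free": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
    "mem_used": [0.5, 0.6, 0.5, 0.6, 0.6, 0.5],
}
fault = [0, 0, 1, 1, 0, 1]
selected = select_attributes(metrics, fault)
```

Here only `cpu_load` passes the cut-off: `disk_free` is constant and `mem_used` is only weakly correlated with the fault label.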
A method for gearbox fault diagnosis consists of feature extraction and fault identification. Many feature extraction methods have been devised to expose the nature of the vibration data of a defective gearbox. In addition, features extracted from gearbox vibration data are identified by various classifiers. However, the existing literature leaves much to be desired in assessing the performance of different combinatorial methods for gearbox fault diagnosis. To this end, this paper evaluates the performance of several typical combinatorial methods for gearbox fault diagnosis by associating each of multifractal detrended fluctuation analysis (MFDFA), empirical mode decomposition (EMD) and wavelet transform (WT) with each of neural network (NN), Mahalanobis distance decision rules (MDDR) and support vector machine (SVM). The performance of the different combinatorial methods is then compared using a group of gearbox vibration data containing slightly different fault patterns. The results indicate that MFDFA performs better in feature extraction of gearbox vibration data and SVM performs better in fault identification. Accordingly, the method associating MFDFA with SVM shows great potential for fault diagnosis of gearboxes. This paper can therefore provide useful information for constructing a method for gearbox fault diagnosis.
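As an illustration of one of the classifiers named above, the following is a minimal sketch of a nearest-class Mahalanobis distance rule. It assumes two-dimensional feature vectors and made-up training points so that the covariance matrix can be inverted in closed form; the abstract does not specify the features or the exact decision rule used.

```python
def mean_and_cov(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def classify(x, training):
    """Assign x to the class whose distribution is nearest in Mahalanobis distance."""
    stats = {label: mean_and_cov(points) for label, points in training.items()}
    return min(stats, key=lambda label: mahalanobis_sq(x, *stats[label]))

# Hypothetical two-feature training vectors per gearbox condition.
training = {
    "normal": [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2)],
    "fault": [(3.0, 3.1), (3.2, 2.9), (2.9, 3.0), (3.1, 3.3)],
}
label = classify((1.0, 1.1), training)
```

Unlike a plain Euclidean rule, this distance scales each direction by the spread of the class, so a class with widely scattered features tolerates larger deviations.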
There are four serious problems in discriminant analysis. We developed an optimal linear discriminant function (optimal LDF) based on the minimum number of misclassifications (minimum NM, MNM) using integer programming (IP). We call this LDF Revised IP-OLDF. Only this LDF can discriminate the cases on the discriminant hyperplane (Problem 1). This LDF and a hard-margin SVM (H-SVM) can discriminate linearly separable data (LSD) exactly. Other LDFs may not discriminate the LSD theoretically (Problem 2). When Revised IP-OLDF discriminates the Swiss banknote data with six variables, we find that the MNM of a two-variable model such as (X4, X6) is zero. Because MNM decreases monotonically (MNMk ≥ MNM(k+1)), the sixteen MNMs of the models including (X4, X6) are zero. Because there had been no research on LSD until now, we surveyed three other linearly separable data sets: 18 exam score data sets, the Japanese 44 cars data and six microarray data sets. When we discriminate the exam scores with MNM = 0, we find that the generalized inverse matrix technique causes the serious Problem 3, and we confirmed this fact with the cars data. Finally, we claim that discriminant analysis is not inferential statistics because there are no standard errors (SEs) of error rates and discriminant coefficients (Problem 4). Therefore, we proposed the "100-fold cross validation for the small sample" method (the method). With this breakthrough, we can choose the best model, the one with the minimum mean error rate (M2) in the validation sample, and obtain two 95% confidence intervals (CIs), one for the error rate and one for the discriminant coefficients. When we discriminate the exam scores by this new method, we obtain the surprising result that seven LDFs, all except Fisher's LDF, are almost the same as the trivial LDFs. In this research, we discriminate the Japanese 44 cars data because it allows us to discuss all four problems. There are six independent variables to discriminate 29 regular cars and 15 small cars. These data are linearly separable by the emission rate (X1) and the number of seats (X3). We examine the validity of the new model selection procedure for discriminant analysis. We proposed that the model with the minimum mean error rate (M2) in the validation samples is the best model. We had examined this procedure with the exam scores and obtained good results. Moreover, the 95% CIs of eight LDFs offer us a real perception of discriminant theory. However, the exam scores are different from the ordinal data. Therefore, we apply our theory and procedure to the Japanese 44 cars data and confirm the same conclusion.
Chloroplasts are organelles found in plant cells that conduct photosynthesis. The subchloroplast locations of proteins are correlated with their functions. With the availability of a large amount of protein data, it is highly desirable to develop a computational method to predict the subchloroplast locations of chloroplast proteins. In this study, we propose a novel method to predict the subchloroplast locations of proteins using tripeptide compositions. It first uses the binomial distribution to optimize the feature sets. Then the support vector machine is selected to predict the subchloroplast locations of proteins. The proposed method was tested on a reliable and rigorous dataset including 259 chloroplast proteins with sequence identity ≤ 25%. In the jack-knife cross-validation, 92.21% of envelope, 93.20% of thylakoid membrane, 52.63% of thylakoid lumen and 85.00% of stroma proteins are correctly identified. The overall accuracy reaches 88.03%, which is higher than that of other models. Based on this method, a predictor called ChloPred has been built and is freely available at http://cobi.uestc.edu.cn/people/hlin/tools/ChloPred/. The predictor will provide important information for theoretical and experimental research on chloroplast proteins.
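The tripeptide composition feature can be sketched as follows. This is a minimal illustration that assumes the 20 standard amino acids and normalization by the number of counted windows; the example sequence is made up, and the binomial-distribution feature ranking and the SVM are not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
TRIPEPTIDES = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]
INDEX = {tri: i for i, tri in enumerate(TRIPEPTIDES)}

def tripeptide_composition(sequence):
    """Return the 8000-dimensional tripeptide frequency vector of a sequence.

    Overlapping windows of length three are counted; windows containing a
    non-standard residue (for example 'X') are skipped.
    """
    counts = [0] * len(TRIPEPTIDES)
    total = 0
    for i in range(len(sequence) - 2):
        idx = INDEX.get(sequence[i:i + 3])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRIPEPTIDES)
    return [c / total for c in counts]

# Hypothetical toy sequence; real inputs would be full protein sequences.
vector = tripeptide_composition("ACDACD")
```

With 20 residues there are 20³ = 8000 possible tripeptides, which is why a feature-selection step such as the one described above is needed before training a classifier on a few hundred proteins.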
Funding (diesel engine valve clearance fault diagnosis): Supported by the National Science and Technology Support Program of China (No. 2015BAF07B04).
Funding (online transient stability assessment): Funded by the Science and Technology Project of State Grid Shanxi Electric Power Co., Ltd. (Project No. 520530200013).
Funding (online fault detection for cloud systems): Supported by the National Natural Science Foundation of China (61472005, 61201252) and the CERNET Innovation Project (NGII20160207).
Funding (gearbox fault diagnosis): Supported by the Shandong Provincial Natural Science Foundation, China (ZR2012EEL07).