Funding: supported by the Program for New Century Excellent Talents in University (No. NCET-08-0836), the National Natural Science Foundation of China (Nos. 60804022, 60974050 and 61072094), the Fok Ying-Tung Education Foundation for Young Teachers (No. 121066), and the Natural Science Foundation of Jiangsu Province (No. BK2008126).
Abstract: Coal mines require various kinds of machinery, and fault diagnosis of this equipment has a great impact on mine production. A Probability Least Squares Support Vector Classification Machine (PLSSVCM) is proposed to address the incorrect classification of noisy data by traditional support vector machines. Samples that cannot be definitely assigned to one class are assigned by the PLSSVCM on the basis of a probability value, which gives the classification results both a qualitative explanation and a quantitative evaluation. Simulation results for a fault diagnosis show that the classification accuracy of the PLSSVCM is 100%. Even when the samples are noisy, the PLSSVCM can still effectively perform multi-class fault diagnosis of a roller bearing. The generalization performance of the PLSSVCM is better than that of a neural network and an LSSVCM.
Funding: supported by the Fundamental Research Funds for University of Science and Technology Beijing (FRF-BR-12-021).
Abstract: A semi-supervised vector machine is a relatively new learning method that uses both labeled and unlabeled data in classification. Because the objective function of the unconstrained semi-supervised vector machine model is not smooth, many fast optimization algorithms cannot be applied to solve it. New methods that can handle the non-smooth objective function and solve the semi-supervised vector machine with the desired classification accuracy are therefore in great demand. A quintic spline function that is three times differentiable at the origin is constructed by a general three-moment method and used to approximate the symmetric hinge loss function. The approximation accuracy of the quintic spline function is estimated. A quintic spline smooth semi-supervised vector machine is then obtained, and the convergence accuracy of the smooth model to the non-smooth one is analyzed. Three experiments are performed to test the efficiency of the model. The experimental results show that the new model outperforms other smooth models in terms of classification performance. Furthermore, the new model is not sensitive to an increasing number of labeled samples, which means that the new model is more efficient.
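To make the source of the non-smoothness concrete, one common way of writing an unconstrained semi-supervised vector machine objective is given below. This is a textbook-style sketch, not necessarily the exact model of the paper: the symbols $w$, $b$, $C$, $C^*$, the counts $l$ (labeled) and $u$ (unlabeled), and the linear decision function $f$ are all assumptions introduced here.

```latex
\min_{w,\,b}\;\; \frac{1}{2}\lVert w\rVert^{2}
  \;+\; C\sum_{i=1}^{l}\max\bigl(0,\; 1 - y_i f(x_i)\bigr)
  \;+\; C^{*}\sum_{j=l+1}^{l+u}\max\bigl(0,\; 1 - \lvert f(x_j)\rvert\bigr),
\qquad f(x) = w^{\top}x + b .
```

The last term is the symmetric hinge loss $\max(0, 1-\lvert t\rvert)$ evaluated on unlabeled points. It has kinks at $t = 0$ and $t = \pm 1$, which is why a smooth surrogate such as the quintic spline described above is needed before gradient-based or Newton-type solvers can be applied.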
Funding: supported financially by the Ministerio de Ciencia e Innovación (Spain) and the European Regional Development Fund under the Research Grant WindSound Project (Ref.: PID2021-125278OB-I00).
Abstract: Maintenance operations have a critical influence on power generation by wind turbines (WT). Advanced algorithms must analyze large volumes of data from condition monitoring systems (CMS) to determine the actual working conditions and avoid false alarms. This paper proposes different support vector machine (SVM) algorithms for the prediction and detection of false alarms. K-fold cross-validation (CV) is applied to evaluate the classification reliability of these algorithms. Supervisory Control and Data Acquisition (SCADA) data from an operating WT are used to test the proposed approach. The quadratic SVM achieved an accuracy of 98.6%. Misclassifications from the confusion matrix, the alarm log, and maintenance records are analyzed to obtain quantitative information and determine whether each alarm is a false alarm. The classifier reduces the number of false alarms, referred to as misclassifications, by 25%. These results demonstrate that the proposed approach offers high reliability and accuracy in false alarm identification.
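The K-fold evaluation mentioned in this abstract can be sketched as follows. This is a minimal, library-free illustration, not the authors' pipeline: the function names, the `fit`/`predict` callables, and the default `k=5` are assumptions made for the example, and any classifier (an SVM included) could be plugged in.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs that partition range(n_samples) into k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_validated_accuracy(fit, predict, X, y, k=5, seed=0):
    """Average held-out accuracy over k folds.

    fit(X_train, y_train) -> model and predict(model, x) -> label are
    hypothetical callables supplied by the caller.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)
```

Each sample is held out exactly once, so the averaged score uses every record for testing without ever testing on data the model was trained on in that fold.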
Abstract: This paper concerns the error analysis of Multicategory Support Vector Machine (MSVM) classifiers based on reproducing kernel Hilbert spaces. We choose the polynomial kernel as the Mercer kernel and give the error estimate using de la Vallée Poussin means. We also introduce the standard estimation of the sample error and derive an explicit learning rate.
Funding: National Key Research and Development Program of China (No. 2016YFF0103604), National Natural Science Foundations of China (Nos. 61171165, 11431015, 61571230), National Scientific Equipment Developing Project of China (No. 2012YQ050250), and Natural Science Foundation of Jiangsu Province, China (No. BK20161500).
Abstract: A comprehensive assessment of spatial-aware supervised learning algorithms for hyperspectral image (HSI) classification is presented. For this purpose, standard support vector machines (SVMs), multinomial logistic regression (MLR), and sparse representation (SR) based supervised learning algorithms are compared both theoretically and experimentally. The performance of the discussed techniques is evaluated in terms of overall accuracy, average accuracy, kappa statistic coefficients, and sparsity of the solutions. Execution time, computational burden, and the capability of the methods are investigated using probabilistic analysis. The classical benchmark AVIRIS Indian Pines data set is used to validate the accuracy. Experiments show that integrating spectral-spatial context can further improve the accuracy and reduce the misclassification error, although the computational time increases.
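The three accuracy measures named in this abstract are all derived from a confusion matrix. The sketch below uses the standard definitions (overall accuracy, mean per-class recall, and Cohen's kappa); the function name and the row/column convention are assumptions of this example and are not taken from the paper.

```python
def classification_scores(confusion):
    """Return (overall accuracy, average accuracy, kappa).

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j (a convention assumed for this sketch).
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n_classes))
                for j in range(n_classes)]

    overall = correct / total

    # Average accuracy: mean of per-class recalls; empty classes are skipped.
    recalls = [confusion[i][i] / row_sums[i]
               for i in range(n_classes) if row_sums[i]]
    average = sum(recalls) / len(recalls)

    # Cohen's kappa: agreement corrected for the agreement expected by chance.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    kappa = (overall - expected) / (1 - expected) if expected != 1 else 1.0
    return overall, average, kappa
```

Average accuracy weights every class equally, so it exposes poor performance on small classes that overall accuracy can hide, which matters for imbalanced scenes such as Indian Pines.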
Funding: supported by the National Science and Technology Major Project of the Ministry of Science and Technology of China (2014ZX03001027).
Abstract: Terminals can use various heterogeneous networks to deliver a better quality of service, and signal system recognition and classification contribute substantially to this process. However, under low signal-to-noise ratio (SNR) conditions or time-varying multipath channels, most existing signal recognition algorithms face limitations. In this series, we present a robust signal recognition method based on the original and the latest updated version of the extreme learning machine (ELM) to help users switch between networks. The ELM uses signal characteristics to distinguish systems. The strength of this algorithm lies in the random choice of hidden nodes and in the analytical determination of the output weights, which result in lower complexity. Theoretically, the algorithm tends to offer good generalization performance at an extremely fast learning speed. Moreover, we implement the GSM/WCDMA/LTE models in the Matlab environment using the Simulink tools. The simulations show that the signals can be recognized with 95% accuracy in a low-SNR (0 dB) environment in a time-varying multipath Rayleigh fading channel.
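The two ELM properties highlighted above (random hidden nodes, output weights solved analytically) can be illustrated with a small pure-Python sketch. It is not the paper's implementation: the `tanh` activation, the uniform weight range, and the small `ridge` term (added so the linear system stays well conditioned, in place of a plain pseudo-inverse) are assumptions of this example, as are all function names.

```python
import math
import random


def solve_linear_system(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def hidden_layer(X, W, b):
    """Hidden activations H, one row per sample and one column per hidden node."""
    return [[math.tanh(sum(w * x for w, x in zip(W[j], xi)) + b[j])
             for j in range(len(W))] for xi in X]


def train_elm(X, T, n_hidden=20, ridge=1e-3, seed=0):
    """X: list of feature vectors, T: list of one-hot target vectors."""
    rng = random.Random(seed)
    n_in, n_out = len(X[0]), len(T[0])
    # Step 1: hidden weights and biases are drawn at random and never trained.
    W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = hidden_layer(X, W, b)
    # Step 2: output weights from (H^T H + ridge * I) beta = H^T T, in closed form.
    HtH = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    HtT = [[sum(h[i] * t[k] for h, t in zip(H, T)) for k in range(n_out)]
           for i in range(n_hidden)]
    beta = solve_linear_system(HtH, HtT)
    return W, b, beta


def predict_elm(model, X):
    """Return the index of the largest output for each sample."""
    W, b, beta = model
    H = hidden_layer(X, W, b)
    scores = [[sum(h[i] * beta[i][k] for i in range(len(beta)))
               for k in range(len(beta[0]))] for h in H]
    return [max(range(len(s)), key=s.__getitem__) for s in scores]
```

Because training reduces to one linear solve, there is no iterative weight tuning, which is where the low complexity and fast learning speed come from.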
Abstract: Machine learning and artificial intelligence approaches have rapidly gained popularity in many subsurface energy applications. They are seen as novel methods that may enhance existing capabilities and improve efficiency in exploration and production operations. Furthermore, their integration into reservoir management workflows may shape the future landscape of the energy industry. This study implements a framework that generates predictive models using multiple machine learning classification algorithms, which can identify specific stratigraphic units (i.e., formations) as a function of total vertical depth and spatial position. The framework is applied in a case study to 13 specific formations of interest (Upper Spraberry through Atoka/Morrow reservoirs) in the Midland Basin, West Texas, United States, a prominent hydrocarbon-producing sub-basin of the larger Permian Basin. The study dataset consists of over 275,000 records and includes data fields such as formation identifier, true vertical depth (in feet) of the formations observed, and latitude and longitude coordinates (in decimal degrees). A subset of 134,374 records relevant to the 13 distinct formations of interest was extracted and used for machine learning model training, validation, and testing. Four supervised learning approaches, namely random forest (RF), gradient boosting (GB), support vector machine (SVM), and multilayer perceptron neural network (MLP), were evaluated and their prediction accuracy compared. The best-performing model was ultimately built on the RF algorithm and achieves an overall prediction accuracy of 93 percent on holdout data. The RF-based model demonstrated high prediction accuracy for major oil and gas producing zones, including the San Andres, Upper Spraberry, Lower Spraberry, Clearfork, and Wolfcamp, at 98, 94, 89, 94, and 94 percent, respectively. Overall, the resulting data-driven model provides a robust, cost-effective approach that can complement contemporary reservoir management approaches for multiple subsurface energy applications.
Funding: supported by the National Natural Science Foundation of China (Nos. 61174103 and 61603032), the National Key Technologies R&D Program of China (No. 2015BAK38B01), the National Key Research and Development Program of China (No. 2017YFB0702300), the China Postdoctoral Science Foundation (No. 2016M590048), and the University of Science and Technology Beijing–Taipei University of Technology Joint Research Program (TW201705).
Abstract: The Extreme Learning Machine (ELM) is an effective learning algorithm for a Single-Layer Feedforward Network (SLFN). It performs well on some problems owing to its fast learning speed. In practical applications, however, its performance can be affected by noise in the training data. To tackle the noise issue, we propose a novel heterogeneous ensemble of ELMs in this article. Specifically, correntropy is used to achieve insensitivity to outliers, while Negative Correlation Learning (NCL) is implemented to enhance diversity among the ensemble members. The proposed Heterogeneous Ensemble of ELMs (HE2LM) for classification contains different ELM algorithms, including the Regularized ELM (RELM), the Kernel ELM (KELM), and the L2-norm-optimized ELM (ELML2). The ensemble is constructed by training a randomly selected ELM classifier on a subset of the training data selected through random resampling. The class label of unseen data is then predicted using a maximum weighted sum approach. After the training data are split into subsets, the proposed HE2LM is tested on classification and regression tasks using real-world benchmark datasets and synthetic datasets. The simulation results show that, compared with other algorithms, the proposed method achieves higher prediction accuracy, better generalization, and less sensitivity to outliers.
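The ensemble construction and the maximum weighted sum prediction described above can be sketched generically. The code below is an illustration under stated assumptions rather than the HE2LM algorithm itself: the trainers are hypothetical callables standing in for RELM, KELM, and ELML2, the member weights are supplied by the caller (the abstract does not say how they are obtained), and the correntropy and NCL components are omitted.

```python
import random


def build_ensemble(trainers, X, y, n_members, subset_fraction=0.8, seed=0):
    """Train each member with a randomly chosen trainer on a resampled subset.

    trainers: list of callables train(X_sub, y_sub) -> predictor, where a
    predictor is a callable x -> label (hypothetical interface).
    """
    rng = random.Random(seed)
    size = max(1, int(subset_fraction * len(X)))
    members = []
    for _ in range(n_members):
        # Random resampling with replacement from the training set.
        idx = [rng.randrange(len(X)) for _ in range(size)]
        trainer = rng.choice(trainers)
        members.append(trainer([X[i] for i in idx], [y[i] for i in idx]))
    return members


def predict_weighted(members, weights, x):
    """Return the label whose supporting members have the largest weight sum."""
    if len(members) != len(weights):
        raise ValueError("one weight per ensemble member is required")
    totals = {}
    for member, weight in zip(members, weights):
        label = member(x)
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)
```

With equal weights this reduces to majority voting; unequal weights let a single reliable member outvote several weaker ones.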
Funding: project "Technical Aspects of Monitoring the Acoustic Quality of Infrastructure in Road Transport" (3714541000), commissioned by the German Federal Environment Agency and funded by the Federal Ministry for the Environment, Nature Conservation, Building and Nuclear Safety, Germany, within the Environmental Research Plan 2014.
Abstract: The condition of the road infrastructure has a severe impact on road safety, driving comfort, and rolling resistance. The road infrastructure must therefore be monitored comprehensively and at regular intervals to identify damaged road segments and road hazards. Methods based on vehicle sensors and supervised machine learning classification have been developed to digitize the road infrastructure comprehensively and automatically and to estimate the road quality. Because different types of vehicles have different suspension systems with different response functions, a classifier trained for one vehicle cannot be transferred directly to other vehicles. Usually, a large amount of time is needed to acquire training data for each individual vehicle and classifier. To address this problem, methods have been developed that collect training data automatically for new vehicles by comparing the trajectories of untrained and trained vehicles. The results show that the method based on a k-dimensional tree and Euclidean distance performs best and is robust in transferring road surface information from one vehicle to another. Furthermore, this method makes it possible to merge the output and road infrastructure information from multiple vehicles to enable a more robust and precise prediction of the ground truth.
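The k-dimensional tree with Euclidean distance mentioned above is a nearest-neighbour lookup structure. The sketch below shows how road surface labels could be transferred from a trained vehicle's trajectory to a new vehicle's positions; it is an illustration only, and the function names, the tuple-based point format, and the `max_distance` cut-off are assumptions rather than details from the paper.

```python
import math


def build_kdtree(points, depth=0):
    """Build a k-d tree (nested dicts) from a list of equal-length tuples."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return (distance, point) of the stored point closest to query."""
    if node is None:
        return best
    d = math.dist(node["point"], query)
    if best is None or d < best[0]:
        best = (d, node["point"])
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best


def transfer_labels(trained_points, labels, new_points, max_distance):
    """Give each new point the label of its nearest trained point, or None
    when no trained point lies within max_distance."""
    lookup = dict(zip(trained_points, labels))
    tree = build_kdtree(list(trained_points))
    result = []
    for q in new_points:
        dist, point = nearest(tree, q)
        result.append(lookup[point] if dist <= max_distance else None)
    return result
```

The tree answers each query in roughly logarithmic time on average, which keeps the matching practical when trajectories contain many GPS samples.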