The support vector machine (SVM) is a classical machine learning method. Both the hinge loss and the least absolute shrinkage and selection operator (LASSO) penalty are commonly used in traditional SVMs. However, the hinge loss is not differentiable, and the LASSO penalty does not have the oracle property. In this paper, the huberized loss is combined with non-convex penalties to obtain a model that offers both computational simplicity and the oracle property, contributing to higher accuracy than traditional SVMs. Experiments demonstrate that the two non-convex huberized-SVM methods, smoothly clipped absolute deviation huberized-SVM (SCAD-HSVM) and minimax concave penalty huberized-SVM (MCP-HSVM), outperform the traditional SVM method in terms of prediction accuracy and classifier performance. They are also superior in terms of variable selection, especially when there is a high linear correlation between the variables. When they are applied to financial distress prediction for listed companies, the variables that can affect and predict financial distress are accurately filtered out. Among all the indicators, the per-share indicators have the greatest influence, while the solvency indicators have the weakest. Listed companies can assess their financial situation with the indicators screened by the algorithm and give early warning of possible financial distress with higher precision.
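The abstract above does not give formulas, so the following is only a minimal sketch of the ingredients it names, using the forms that are common in the literature. The threshold `delta`, the SCAD shape parameter `a`, the MCP shape parameter `gamma`, and all function names are assumptions for illustration, not the authors' code.

```python
def huberized_hinge(t, delta=2.0):
    """Huberized hinge loss of the margin t = y * f(x).

    Zero beyond the margin, quadratic near it, linear far from it,
    so the loss is differentiable everywhere (unlike the plain hinge).
    """
    if t > 1.0:
        return 0.0
    if t > 1.0 - delta:
        return (1.0 - t) ** 2 / (2.0 * delta)
    return 1.0 - t - delta / 2.0


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty: LASSO-like near zero, constant for large |beta|."""
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2.0 * a * lam * b - b * b - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0


def mcp_penalty(beta, lam, gamma=3.0):
    """Minimax concave penalty: tapers off and is flat beyond gamma * lam."""
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    return gamma * lam * lam / 2.0
```

Both penalties stop growing for large coefficients, which is the usual explanation for why they shrink strong signals less than LASSO does.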
Hybrid data assimilation (DA) is a method seeing more use in recent hydrology and water resources research. In this study, a DA method coupling support vector machines (SVMs) with the ensemble Kalman filter (EnKF) was used to predict soil moisture in different soil layers: 0-5 cm, 30 cm, 50 cm, 100 cm, 200 cm, and 300 cm. The SVM methodology was first trained on ground measurements of soil moisture and meteorological parameters from the Meilin study area, in East China, to construct statistical prediction models of soil moisture. Subsequent observations and their statistics were used for predictions with two approaches: the SVM predictor alone, and the SVM-EnKF model made by coupling the SVM model with the EnKF technique using the DA method. Validation results showed that the proposed SVM-EnKF model can improve soil moisture predictions in different layers, from the surface to the root zone.
The sea surface temperature (SST) has substantial impacts on the climate; however, due to its highly nonlinear nature and its evidently non-periodic and strongly stochastic properties, SST is rather difficult to predict. Here, the authors combine the complementary ensemble empirical mode decomposition (CEEMD) and support vector machine (SVM) methods to predict SST. Extensive tests from several different aspects are presented to validate the effectiveness of the CEEMD-SVM method. The results suggest that the new method works well in forecasting Northeast Pacific SST at a 12-month lead time, with an average absolute error of approximately 0.3 °C and a correlation coefficient of 0.85. Moreover, no spring predictability barrier is observed in the experiments.
A support vector machine (SVM) ensemble classifier is proposed. An SVM trained in an input space consisting of all the information from many sources does not always perform well. Partitioning the original input space into several input subspaces usually improves performance. Unlike conventional partition methods, the partition method used in this paper, attribute reduction based on rough set theory, allows the input subspaces to overlap partially. These input subspaces can offer complementary information about hidden data patterns. In every subspace, an SVM sub-classifier is learned. Using information fusion techniques, the SVM sub-classifiers with better performance are selected and combined to construct an SVM ensemble. The proposed method is applied to decision-making in medical diagnosis, and its performance is compared with that of several other popular ensemble methods. Experimental results demonstrate that the proposed approach can make full use of the information contained in the data and improve decision-making performance.
In computer vision, emotion recognition using facial expression images is considered an important research issue. Deep learning advances in recent years have helped improve results on this problem. According to recent studies, multiple facial expressions may be included in facial photographs representing a particular type of emotion. It is feasible and useful to convert face photos into collections of visual words and carry out global expression recognition. The main contribution of this paper is a facial expression recognition model (FERM) based on an optimized support vector machine (SVM). AffectNet is used to test the performance of the proposed FERM. AffectNet uses 1250 emotion-related keywords in six different languages to query three major search engines and collect over 1,000,000 facial photos online. The FERM is composed of three main phases: (i) data preparation, (ii) grid search for optimization, and (iii) categorization. Linear discriminant analysis (LDA) is used to categorize the data into eight labels (neutral, happy, sad, surprised, fear, disgust, angry, and contempt). The use of LDA clearly enhances the performance of categorization via SVM. Grid search is used to find the optimal values of the SVM hyperparameters (C and gamma). The proposed optimized SVM algorithm achieved an accuracy of 99% and an F1 score of 98%.
Landslides are a serious natural disaster, second only to earthquakes and floods, and pose a great threat to people's lives and property. Traditional landslide research relies on experience-driven or statistical models, and its assessment results are subjective, difficult to quantify, and poorly targeted. As a new research method for landslide susceptibility assessment, machine learning can greatly improve the accuracy of the landslide susceptibility model by constructing statistical models. Taking Western Henan as an example, the study considered 16 landslide influencing factors covering topography, geological environment, hydrological conditions, and human activities, and selected the 11 factors with the most significant influence on landslides using the recursive feature elimination (RFE) method. Five machine learning methods [Support Vector Machines (SVM), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Linear Discriminant Analysis (LDA)] were used to construct the spatial distribution model of landslide susceptibility. The models were evaluated by the receiver operating characteristic curve and statistical indices. After analysis and comparison, the XGBoost model (AUC 0.8759) performed the best and was suitable for dealing with regression problems. The model was highly adaptable to landslide data. The landslide susceptibility maps of the five models show the overall distribution. The extremely high and high susceptibility areas are distributed in the Funiu Mountain range in the southwest, the Xiaoshan Mountain range in the west, and the Yellow River Basin in the north. These areas have large terrain fluctuations, complicated geological structural environments and frequent human engineering activities. The extremely high and highly prone areas were 12043.3 km² and 3087.45 km², accounting for 47.61% and 12.20% of the total study area, respectively. The study reflects the distribution of landslide susceptibility in western Henan Province and provides a scientific basis for regional disaster warning, prediction, and resource protection. It has important practical significance for subsequent landslide disaster management.
The relationship among the Mercer kernel, the reproducing kernel and the positive definite kernel in the support vector machine (SVM) is proved, and their roles in SVM are discussed. The quadratic form of the kernel matrix is used to confirm positive definiteness and to guide kernel construction. Based on the Bochner theorem, some translation-invariant kernels are checked in their Fourier domain. Some rotation-invariant radial kernels are inspected according to the Schoenberg theorem. Finally, the construction of discrete scaling and wavelet kernels, kernel selection, and kernel parameter learning are discussed.
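The quadratic-form criterion mentioned above can be illustrated numerically: a kernel is positive semidefinite only if c^T K c >= 0 for every coefficient vector c and every Gram matrix K built from it. The sketch below is an assumption-laden illustration (the sample points, the random probing, and all function names are hypothetical); random probing can refute positive definiteness but cannot prove it.

```python
import math
import random


def gram_matrix(kernel, points):
    """Kernel (Gram) matrix K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(x, y) for y in points] for x in points]


def quadratic_form(K, c):
    """Value of c^T K c for a square matrix K and coefficient list c."""
    n = len(c)
    return sum(c[i] * K[i][j] * c[j] for i in range(n) for j in range(n))


def gaussian_kernel(x, y, sigma=1.0):
    """Translation-invariant Gaussian kernel on scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))


def negative_sq_distance(x, y):
    """A function that is NOT a positive definite kernel."""
    return -((x - y) ** 2)


def looks_positive_semidefinite(kernel, points, trials=200, tol=1e-9, seed=0):
    """Probe c^T K c with random c; False as soon as one value is negative."""
    rng = random.Random(seed)
    K = gram_matrix(kernel, points)
    for _ in range(trials):
        c = [rng.uniform(-1.0, 1.0) for _ in points]
        if quadratic_form(K, c) < -tol:
            return False
    return True
```

For example, with the two points 0 and 1 and c = (1, 1), the negative squared distance gives a quadratic form of -2, which rules it out as a positive definite kernel.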
Support vector machines (SVMs) have been introduced as effective methods for solving classification problems. However, due to some limitations in practical applications, their generalization performance is sometimes far from the expected level. Therefore, it is meaningful to study SVM ensemble learning. In this paper, a novel genetic-algorithm-based ensemble learning method, namely Direct Genetic Ensemble (DGE), is proposed. DGE adopts the predictive accuracy of the ensemble as the fitness function and searches the ensemble space for a good ensemble. In essence, DGE is also a selective ensemble learning method, because the base classifiers of the ensemble are selected according to the solution of the genetic algorithm. In comparison with other ensemble learning methods, DGE works on a higher level and is more direct. Different strategies for constructing diverse base classifiers can be used in DGE. Experimental results show that SVM ensembles constructed by DGE can achieve better performance than single SVMs and than bagged and boosted SVM ensembles. In addition, some valuable conclusions are obtained.
Urban living in large modern cities exerts considerable adverse effects on health and thus increases the risk of contracting several chronic kidney diseases (CKDs). The prediction of CKDs has become a major task in urbanized countries. The primary objective of this work is to introduce and develop predictive analytics for predicting CKDs. However, prediction over huge samples is becoming increasingly difficult. Meanwhile, MapReduce provides a feasible framework for programming predictive algorithms with map and reduce functions. The relatively simple programming interface helps solve problems in the scalability and efficiency of predictive learning algorithms. In the proposed work, an iterative weighted MapReduce framework is introduced for the effective management of large dataset samples. A binary classification problem is formulated using ensemble nonlinear support vector machines and random forests. Thus, instead of using the normal linear combination of kernel activations, the proposed work creates nonlinear combinations of kernel activations in prototype examples. Furthermore, different descriptors are combined in an ensemble of deep support vector machines, where the product rule is used to combine the probability estimates of different classifiers. Performance is evaluated in terms of the prediction accuracy and interpretability of the model and the results.
In this work, a total of 322 tests were taken on young volunteers by performing 10 different falls, 6 different Activities of Daily Living (ADL) and 7 Dynamic Gait Index (DGI) tests using a custom-designed Wireless Gait Analysis Sensor (WGAS). In order to perform automatic fall detection, we used a Back Propagation Artificial Neural Network (BP-ANN) and a Support Vector Machine (SVM) based on the 6 features extracted from the raw data. The WGAS, which includes a tri-axial accelerometer, 2 gyroscopes, and an MSP430 microcontroller, is worn by the subjects either at T4 (on the back) or as a belt-clip in front of the waist during the various tests. The raw data is wirelessly transmitted from the WGAS to a nearby PC for real-time fall classification. The BP-ANN is optimized by varying the training, testing and validation data sets and training the network with different learning schemes. The SVM is optimized by trying three different kernels and selecting the kernel with the best classification rate. The overall accuracy of the BP-ANN is 98.20% with LM and RPROP training on the T4 data, while on the data taken at the belt we achieved 98.70% with LM and SCG learning. The overall accuracy using SVM was 98.80% and 98.71% with the RBF kernel on the T4 and belt position data, respectively.
This study was conducted to establish a Support Vector Machines (SVM)-Markov Chain prediction model for the prediction of mining water inflow. The SVM model was built from the raw data sequence and then revised by means of a Markov state transition probability matrix. By dividing the states and analyzing the absolute errors, relative errors and other indexes of the measured values and the SVM fitted values, the prediction results were improved. Finally, the model was used to calculate relative errors. In predicting and analyzing mining water inflow, the prediction results of the model were satisfactory. The results of this study enlarge the application scope of the SVM prediction model and provide a new method for scientifically forecasting water inflow in coal mining.
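The Markov revision step described above can be sketched as follows: relative errors between measured and SVM-fitted values are binned into states, and a state transition probability matrix is estimated from consecutive states. This is a minimal sketch under stated assumptions; the state boundaries, the helper names, and the row-normalised counting estimator are illustrative choices, not details given in the abstract.

```python
def relative_errors(measured, fitted):
    """Relative error (measured - fitted) / measured for each sample."""
    return [(m - f) / m for m, f in zip(measured, fitted)]


def to_states(errors, boundaries):
    """Map each error to the index of the interval it falls in.

    boundaries must be sorted ascending; k boundaries define k + 1 states.
    """
    return [sum(1 for b in boundaries if e >= b) for e in errors]


def transition_matrix(states, n_states):
    """Row-normalised counts of transitions between consecutive states."""
    counts = [[0] * n_states for _ in range(n_states)]
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```

Given the current error state, the corresponding row of the matrix gives the estimated probabilities of the next error state, which can then be used to adjust the next SVM prediction.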
To address the problems of the traditional method of assessing particle size distribution in bench blasting, a support vector machine (SVM) regression methodology based on statistical learning theory was used to predict the mean particle size (X50) resulting from rock blast fragmentation in various mines. The database consisted of blast design parameters, explosive parameters, modulus of elasticity and in-situ block size. The seven independent input variables used in the SVM model for predicting X50 were the ratio of bench height to drilled burden (H/B), ratio of spacing to burden (S/B), ratio of burden to hole diameter (B/D), ratio of stemming to burden (T/B), powder factor (Pf), modulus of elasticity (E) and in-situ block size (XB). After training and testing on 90 sets of measured data from various mines and rock formations around the world, the trained support vector regression (SVR) model was validated on 12 other blast data sets. The prediction results of SVR were compared with those of artificial neural network (ANN) and multivariate regression analysis (MVRA) models, the conventional Kuznetsov method and the measured X50 values. The proposed method shows promising results, and the prediction accuracy of the SVM model is acceptable.
The support vector machine (SVM) is a novel machine learning method that can approximate nonlinear functions with arbitrary accuracy. Setting the parameters well is crucial for SVM learning results and generalization ability, yet there is currently no systematic, general method for parameter selection. In this article, SVM parameter selection for function approximation is regarded as a compound optimization problem, and a mutative scale chaos optimization algorithm is employed to search for optimal parameter values. The chaos optimization algorithm is an effective approach to global optimization, and the mutative scale chaos algorithm can improve search efficiency and accuracy. Several simulation examples show the sensitivity of the SVM parameters and demonstrate the superiority of the proposed method for nonlinear function approximation.
In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables, the small number of samples, and the non-linearity of the problem. It is difficult to obtain satisfactory results with conventional linear statistical methods. Recursive feature elimination based on the support vector machine (SVM RFE) is an effective algorithm for gene selection and cancer classification, which it integrates into a consistent framework. In this paper, we propose a new method for selecting the parameters of this algorithm, implemented with Gaussian kernel SVMs, as a better alternative to the common practice of selecting the apparently best parameters: a genetic algorithm is used to search for a pair of optimal parameters. Fast implementation issues for this method are also discussed for pragmatic reasons. The proposed method was tested on two representative datasets, hereditary breast cancer and acute leukaemia. The experimental results indicate that the proposed method performs well in selecting genes and achieves high classification accuracies with these genes.
To solve multi-class fault diagnosis tasks, the decision tree support vector machine (DTSVM), which combines SVM and a decision tree using the concept of dichotomy, is proposed. Since the classification performance of DTSVM depends heavily on its structure, a genetic algorithm is introduced into the formation of the decision tree to cluster the classes with maximum distance between the clustering centers of the two sub-classes, so that the most separable classes are separated at each node of the decision tree. Numerical simulations on three datasets, compared with the "one-against-all" and "one-against-one" methods, demonstrate that the proposed method has better performance and higher generalization ability than the two conventional methods.
Seven factors, including the maximum volume of a single flow, occurrence frequency of debris flow, watershed area, main channel length, watershed relative height difference, valley incision density and the length ratio of sediment supplement, are chosen as evaluation factors of debris flow hazard degree. Using support vector machine (SVM) theory, we selected 259 basic data items from 37 debris flow channels in Yunnan Province as learning samples and created a debris flow hazard assessment model based on SVM. The model was validated through instance applications and showed encouraging results.
Laser-induced breakdown spectroscopy (LIBS) is a versatile tool for both qualitative and quantitative analysis. In this paper, LIBS combined with principal component analysis (PCA) and support vector machine (SVM) is applied to rock analysis. Fourteen emission lines, including Fe, Mg, Ca, Al, Si, and Ti, are selected as analysis lines. A good accuracy (91.38% for the real rock) is achieved by using SVM to analyze the spectroscopic peak area data processed by PCA. Combining PCA and SVM not only reduces the noise and dimensionality, which improves the efficiency of the program, but also solves the problem of linear inseparability. This method validates the ability of LIBS to classify rock.
By adopting chaotic searching to improve the global searching performance of particle swarm optimization (PSO), and using the improved PSO to optimize the key parameters of the support vector machine (SVM) forecasting model, an improved SVM model named CPSO-SVM was proposed. The new model was applied to short-term load prediction, and its improvement was verified. Simulation results on actual data from the South China Power Market show that the new method improves the forecast accuracy by 2.23% and 3.87% compared with the PSO-SVM and SVM methods, respectively. Compared with the PSO-SVM and SVM methods, the time cost of the new model increases by only 3.15 s and 4.61 s, respectively, which indicates that the CPSO-SVM model achieves a significant improvement.
According to the chaotic and non-linear character of power load data, the time series matrix is established with the theory of phase-space reconstruction, and the Lyapunov exponents of the chaotic time series are computed to determine the time delay and the embedding dimension. Because the data have different features, a data mining algorithm is used to classify the data into different groups. Redundant information is eliminated by means of data mining technology, and the system searches for historical loads whose features are highly similar to those of the forecasting day. As a result, the training data can be reduced and the computing speed improved when constructing the support vector machine (SVM) model. The SVM algorithm is then used to predict the power load with the parameters obtained in preprocessing. To prove the effectiveness of the new model, the results of the data mining SVM (DSVM) algorithm are compared with those of a single SVM and a back propagation (BP) network. The new DSVM algorithm improves the forecast accuracy by 0.75%, 1.10% and 1.73% compared with SVM for two random dimensions (11-dimension and 14-dimension) and the BP network, respectively. This indicates that DSVM markedly improves short-term power load forecasting.
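The phase-space reconstruction step named above builds a matrix of delay vectors from a scalar load series. The sketch below assumes the embedding dimension `dim` and time delay `tau` are already known (the abstract determines them from Lyapunov exponents, which is not reproduced here); the function name and the example series are hypothetical.

```python
def delay_embed(series, dim, tau):
    """Time-delay embedding of a scalar series.

    Row i is [x[i], x[i + tau], ..., x[i + (dim - 1) * tau]].
    """
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[i + k * tau] for k in range(dim)] for i in range(n_vectors)]


# Hypothetical hourly loads, embedded with dimension 3 and delay 2.
loads = [1, 2, 3, 4, 5, 6]
matrix = delay_embed(loads, dim=3, tau=2)
```

Each row of the resulting matrix can serve as one input vector for a regression model such as an SVM, with the next load value as the target.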
In this paper, we present a novel Support Vector Machine active learning algorithm for effective 3D model retrieval using the concept of relevance feedback. The proposed method learns from the most informative objects, which are marked by the user, and then creates a boundary separating the relevant models from the irrelevant ones. It needs only a small number of 3D models labelled by the user, and it can grasp the user's semantic knowledge rapidly and accurately. Experimental results showed that the proposed algorithm significantly improves retrieval effectiveness. Compared with four state-of-the-art query refinement schemes for 3D model retrieval, it provides superior retrieval performance after no more than two rounds of relevance feedback.
Funding (soil moisture SVM-EnKF study): Supported by the National Basic Research Program of China (the 973 Program, Grant No. 2010CB951101) and the Program for Changjiang Scholars and Innovative Research Teams in Universities, Ministry of Education, China (Grant No. IRT0717).
Funding (CEEMD-SVM sea surface temperature study): Supported in part by the Major Research Plan of the National Natural Science Foundation of China [grant number 91530204] and the State Key Program of the National Natural Science Foundation of China [grant number 41430426].
Funding (rough-set SVM ensemble study): Supported by the High Technology Research and Development Programme of China (2002AA412010), the National Key Basic Research and Development Program of China (2002cb312200), and the National Natural Science Foundation of China (60174038).
Funding: This work was financially supported by the National Natural Science Foundation of China (41972262), the Hebei Natural Science Foundation for Excellent Young Scholars (D2020504032), the Central Plains Science and Technology Innovation Leader Project (214200510030), and the Key Research and Development Project of Henan Province (221111321500).
Abstract: Landslides are a serious natural disaster, second only to earthquakes and floods, and pose a great threat to people's lives and property. Traditional landslide research relies on experience-driven or statistical models, and its assessment results are subjective, difficult to quantify, and poorly targeted. As a new research method for landslide susceptibility assessment, machine learning can greatly improve the accuracy of landslide susceptibility models by constructing statistical models. Taking western Henan as an example, the study selected 16 landslide influencing factors covering topography, geological environment, hydrological conditions, and human activities, and then used the recursive feature elimination (RFE) method to select the 11 factors with the most significant influence on landslides. Five machine learning methods [Support Vector Machines (SVM), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Linear Discriminant Analysis (LDA)] were used to construct the spatial distribution model of landslide susceptibility. The models were evaluated by the receiver operating characteristic curve and statistical indices. After analysis and comparison, the XGBoost model (AUC 0.8759) performed the best and was suitable for dealing with regression problems. The model was highly adaptable to the landslide data. The overall distribution can be observed from the landslide susceptibility maps of the five models. The extremely high and high susceptibility areas are distributed in the Funiu Mountain range in the southwest, the Xiaoshan Mountain range in the west, and the Yellow River Basin in the north. These areas have large terrain fluctuations, complicated geological structural environments, and frequent human engineering activities. The extremely high and high susceptibility areas cover 12043.3 km² and 3087.45 km², accounting for 47.61% and 12.20% of the total study area, respectively. The study reflects the distribution of landslide susceptibility in western Henan Province, which provides a scientific basis for regional disaster warning, prediction, and resource protection, and it has important practical significance for subsequent landslide disaster management.
Funding: Supported by the National Natural Science Foundation of China (60473035)
Abstract: The relationships among the Mercer kernel, the reproducing kernel, and the positive definite kernel in support vector machines (SVMs) are proved, and their roles in SVMs are discussed. The quadratic form of the kernel matrix is used to confirm the positive definiteness of kernels and their construction. Based on the Bochner theorem, some translation-invariant kernels are checked in the Fourier domain. Some rotation-invariant radial kernels are inspected according to the Schoenberg theorem. Finally, the construction of discrete scaling and wavelet kernels, kernel selection, and kernel parameter learning are discussed.
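The quadratic-form test mentioned in this abstract can be illustrated with a small sketch. It is not the authors' procedure. It assumes a Gaussian kernel, four hypothetical sample points, and randomly drawn test vectors z, and checks that z^T K z is non-negative. Random sampling can only reveal a violation; it cannot prove positive definiteness. All function names are illustrative.

```python
import math
import random

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel, a translation-invariant kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def gram_matrix(points, kernel):
    """Kernel (Gram) matrix K with K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]

def quadratic_form(K, z):
    """Evaluate z^T K z."""
    n = len(z)
    return sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))

def looks_positive_semidefinite(K, trials=200, tol=1e-9, seed=0):
    """Return False as soon as some random z gives z^T K z < -tol."""
    rng = random.Random(seed)
    n = len(K)
    for _ in range(trials):
        z = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        if quadratic_form(K, z) < -tol:
            return False
    return True

# Hypothetical sample points, chosen only for illustration.
points = [(0.0, 0.0), (1.0, 0.5), (-0.5, 2.0), (3.0, -1.0)]
K = gram_matrix(points, gaussian_kernel)
psd_ok = looks_positive_semidefinite(K)

# A symmetric matrix with a negative eigenvalue, for contrast.
bad = [[1.0, 2.0], [2.0, 1.0]]
bad_ok = looks_positive_semidefinite(bad)
```

In practice an eigenvalue decomposition of K gives a more decisive check than random probing. The sampling version is shown only because it mirrors the quadratic-form definition directly.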
Funding: This work was supported by the National Basic Research Program of China under Grant 2002cb312200 01 3 and the National Nature Science Foundation of China under Grant 60174038.
Abstract: Support vector machines (SVMs) have been introduced as effective methods for solving classification problems. However, due to some limitations in practical applications, their generalization performance is sometimes far from the expected level. Therefore, it is meaningful to study SVM ensemble learning. In this paper, a novel genetic-algorithm-based ensemble learning method, Direct Genetic Ensemble (DGE), is proposed. DGE adopts the predictive accuracy of the ensemble as the fitness function and searches the ensemble space for a good ensemble. In essence, DGE is also a selective ensemble learning method, because the base classifiers of the ensemble are selected according to the solution of the genetic algorithm. In comparison with other ensemble learning methods, DGE works at a higher level and is more direct. Different strategies for constructing diverse base classifiers can be used in DGE. Experimental results show that SVM ensembles constructed by DGE achieve better performance than single SVMs and than bagged and boosted SVM ensembles. In addition, some valuable conclusions are obtained.
Abstract: Urban living in large modern cities exerts considerable adverse effects on health and thus increases the risk of contracting several chronic kidney diseases (CKDs). The prediction of CKDs has become a major task in urbanized countries. The primary objective of this work is to introduce and develop predictive analytics for predicting CKDs. However, prediction over huge samples is becoming increasingly difficult. Meanwhile, MapReduce provides a feasible framework for programming predictive algorithms with map and reduce functions. The relatively simple programming interface helps solve problems in the scalability and efficiency of predictive learning algorithms. In the proposed work, an iterative weighted MapReduce framework is introduced for the effective management of large dataset samples. A binary classification problem is formulated using ensemble nonlinear support vector machines and random forests. Thus, instead of using the normal linear combination of kernel activations, the proposed work creates nonlinear combinations of kernel activations in prototype examples. Furthermore, different descriptors are combined in an ensemble of deep support vector machines, where the product rule is used to combine the probability estimates of different classifiers. Performance is evaluated in terms of the prediction accuracy and interpretability of the model and the results.
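The product rule for combining probability estimates can be sketched as follows. This is a minimal illustration, not the authors' implementation. The three classifiers and their probability values are made up, and the function name `product_rule` is hypothetical.

```python
def product_rule(prob_lists):
    """Fuse per-classifier class-probability vectors by multiplying them
    class by class and renormalizing so the result sums to 1."""
    n_classes = len(prob_lists[0])
    combined = [1.0] * n_classes
    for probs in prob_lists:
        if len(probs) != n_classes:
            raise ValueError("all classifiers must score the same classes")
        for k, p in enumerate(probs):
            combined[k] *= p
    total = sum(combined)
    if total == 0.0:
        raise ValueError("every class received zero probability")
    return [c / total for c in combined]

# Made-up estimates from three classifiers for classes (0, 1).
estimates = [[0.6, 0.4], [0.7, 0.3], [0.4, 0.6]]
fused = product_rule(estimates)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

Because the rule multiplies, a single classifier that assigns a class near-zero probability can veto that class, which is the usual design trade-off against the sum (averaging) rule.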
Abstract: In this work, a total of 322 tests were performed on young volunteers, covering 10 different falls, 6 different Activities of Daily Living (ADL), and 7 Dynamic Gait Index (DGI) tests, using a custom-designed Wireless Gait Analysis Sensor (WGAS). To perform automatic fall detection, we used a Back Propagation Artificial Neural Network (BP-ANN) and a Support Vector Machine (SVM) based on 6 features extracted from the raw data. The WGAS, which includes a tri-axial accelerometer, 2 gyroscopes, and an MSP430 microcontroller, is worn by the subjects either at T4 (on the back) or as a belt clip in front of the waist during the various tests. The raw data are wirelessly transmitted from the WGAS to a nearby PC for real-time fall classification. The BP-ANN is optimized by varying the training, testing, and validation data sets and by training the network with different learning schemes. The SVM is optimized by trying three different kernels and selecting the kernel with the best classification rate. The overall accuracy of the BP-ANN is 98.20% with LM and RPROP training on the T4 data, while on the data taken at the belt we achieved 98.70% with LM and SCG learning. The overall accuracy of the SVM with an RBF kernel was 98.80% and 98.71% on the T4 and belt position data, respectively.
Abstract: This study establishes a Support Vector Machine (SVM)-Markov chain model for predicting mining water inflow. The SVM model was built from the raw data sequence and then revised by means of a Markov state transition probability matrix. By dividing the states and analyzing the absolute errors, relative errors, and other indices of the measured values and the SVM fitted values, the prediction results were improved. Finally, the model was used to calculate relative errors. In predicting and analyzing mining water inflow, the model gave satisfactory results. The results of this study extend the application scope of the SVM prediction model and provide a new method for the scientific forecasting of water inflow in coal mining.
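One way to read the Markov revision step is sketched below. It assumes that the relative errors of the SVM fit are binned into a few states and that the transition probabilities are estimated by counting consecutive state pairs. The error values, the state boundaries, and the function names are all hypothetical and are not taken from the paper.

```python
def to_states(rel_errors, edges):
    """Map each relative error to the index of the interval it falls in.
    With edges [e0, e1], the states are: < e0, [e0, e1), >= e1."""
    return [sum(1 for edge in edges if e >= edge) for e in rel_errors]

def transition_matrix(states, n_states):
    """Estimate P[i][j] = Pr(next state = j | current state = i) by counting."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observed transitions are left as all zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Hypothetical relative errors (measured vs. SVM-fitted values).
rel_errors = [-0.08, 0.02, 0.07, -0.01, -0.06, 0.03, 0.09, 0.01]
edges = [-0.05, 0.05]  # three states: under-, near-, and over-prediction
states = to_states(rel_errors, edges)
P = transition_matrix(states, 3)
```

Given the state of the latest fitted point, the corresponding row of `P` indicates which error state is most likely next, and the SVM forecast could then be shifted by a representative error for that state.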
Funding: Project (2006BAB02A02) supported by the National Key Technology R&D Program during the 11th Five-year Plan Period of China; Project (CX2011B119) supported by the Graduated Students' Research and Innovation Fund of Hunan Province, China; Project (2009ssxt230) supported by the Central South University Innovation Fund, China
Abstract: To address the problems of the traditional method of assessing particle size distribution in bench blasting, a support vector machine (SVM) regression methodology based on statistical learning theory was used to predict the mean particle size (X50) resulting from rock blast fragmentation in various mines. The database consisted of blast design parameters, explosive parameters, modulus of elasticity, and in-situ block size. The seven independent input variables used by the SVM model to predict X50 were the ratio of bench height to drilled burden (H/B), ratio of spacing to burden (S/B), ratio of burden to hole diameter (B/D), ratio of stemming to burden (T/B), powder factor (Pf), modulus of elasticity (E), and in-situ block size (XB). After training and testing on 90 sets of measured data from various mines and rock formations around the world, the trained support vector regression (SVR) model was validated on 12 other blast data sets. The prediction results of SVR were compared with those of artificial neural network (ANN) and multivariate regression analysis (MVRA) models, the conventional Kuznetsov method, and the measured X50 values. The proposed method shows promising results, and the prediction accuracy of the SVM model is acceptable.
Funding: National Nature Science Foundation of China (60775047, 60402024)
Abstract: The support vector machine (SVM) is a novel machine learning method that can approximate nonlinear functions with arbitrary accuracy. Setting the parameters well is crucial for SVM learning results and generalization ability, yet there is currently no systematic, general method for parameter selection. In this article, SVM parameter selection for function approximation is regarded as a compound optimization problem, and a mutative scale chaos optimization algorithm is employed to search for optimal parameter values. The chaos optimization algorithm is an effective approach to global optimization, and the mutative scale chaos algorithm can improve search efficiency and accuracy. Several simulation examples show the sensitivity of the SVM parameters and demonstrate the superiority of the proposed method for nonlinear function approximation.
Funding: Project supported by the National Basic Research Program (973) of China (No. 2002CB312200) and the Center for Bioinformatics Program Grant of Harvard Center of Neurodegeneration and Repair, Harvard Medical School, Harvard University, Boston, USA
Abstract: In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables, the small number of samples, and the non-linearity of the data. It is difficult to obtain satisfactory results with conventional linear statistical methods. Recursive feature elimination based on the support vector machine (SVM RFE) is an effective algorithm for gene selection and cancer classification, integrating both into a consistent framework. In this paper, we propose a new method for selecting the parameters of this algorithm when it is implemented with Gaussian kernel SVMs: a genetic algorithm searches for a pair of optimal parameters, as a better alternative to the common practice of selecting the apparently best parameters. Fast implementation issues for this method are also discussed for pragmatic reasons. The proposed method was tested on two representative datasets, hereditary breast cancer and acute leukaemia. The experimental results indicate that the proposed method performs well in selecting genes and achieves high classification accuracies with these genes.
Funding: Supported by the National Natural Science Foundation of China (60604021, 60874054)
Abstract: To solve multi-class fault diagnosis tasks, a decision tree support vector machine (DTSVM), which combines the SVM and the decision tree using the concept of dichotomy, is proposed. Since the classification performance of DTSVM depends highly on its structure, a genetic algorithm is introduced into the formation of the decision tree to cluster the classes with the maximum distance between the clustering centers of the two sub-classes, so that the most separable classes are separated at each node of the decision tree. Numerical simulations on three datasets, compared with the "one-against-all" and "one-against-one" approaches, demonstrate that the proposed method has better performance and higher generalization ability than these two conventional methods.
Abstract: Seven factors are chosen as evaluation factors of debris flow hazard degree: the maximum volume of a single flow, the occurrence frequency of debris flow, watershed area, main channel length, watershed relative height difference, valley incision density, and the length ratio of sediment supplement. Using support vector machine (SVM) theory, we selected 259 basic data of 37 debris flow channels in Yunnan Province as learning samples and created a debris flow hazard assessment model based on SVM. The model was validated through application to actual cases and showed encouraging results.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 11075184), the Knowledge Innovation Program of the Chinese Academy of Sciences (CAS) (Grant No. Y03RC21124), and the CAS President's International Fellowship Initiative Foundation (Grant No. 2015VMA007)
Abstract: Laser-induced breakdown spectroscopy (LIBS) is a versatile tool for both qualitative and quantitative analysis. In this paper, LIBS combined with principal component analysis (PCA) and support vector machine (SVM) is applied to rock analysis. Fourteen emission lines, including Fe, Mg, Ca, Al, Si, and Ti, are selected as analysis lines. A good accuracy (91.38% for the real rock) is achieved by using SVM to analyze the spectroscopic peak area data processed by PCA. Combining PCA and SVM not only reduces noise and dimensionality, which improves the efficiency of the program, but also solves the problem of linear inseparability. This method validates the ability of LIBS to classify rock.
Funding: Project (70572090) supported by the National Natural Science Foundation of China
Abstract: An improved SVM model, named CPSO-SVM, is proposed. It adopts chaotic searching to improve the global search performance of particle swarm optimization (PSO) and uses the improved PSO to optimize the key parameters of the support vector machine (SVM) forecasting model. The new model was applied to short-term load prediction, and its improvement was verified. Simulation results on actual data from the South China Power Market show that the new method improves the forecast accuracy by 2.23% and 3.87% compared with the PSO-SVM and SVM methods, respectively. The time cost of the new model increases by only 3.15 s and 4.61 s compared with the PSO-SVM and SVM methods, respectively, indicating that the CPSO-SVM model achieves a significant improvement.
Funding: Project (70671039) supported by the National Natural Science Foundation of China
Abstract: Based on the chaotic and non-linear characteristics of power load data, a time series matrix is established using the theory of phase-space reconstruction, and the Lyapunov exponents of the chaotic time series are computed to determine the time delay and the embedding dimension. Because the data have different features, a data mining algorithm is used to classify them into different groups. The data mining step eliminates redundant information, and the system searches for the historical loads whose features are highly similar to those of the forecasting day. As a result, the training data can be reduced and the computing speed improved when constructing the support vector machine (SVM) model. The SVM algorithm is then used to predict the power load with the parameters obtained in preprocessing. To demonstrate the effectiveness of the new model, the results of the data mining SVM (DSVM) algorithm are compared with those of a single SVM and a back propagation (BP) network. The new DSVM algorithm improves the forecast accuracy by 0.75%, 1.10% and 1.73% compared with SVM at two random dimensions (11-dimension and 14-dimension) and the BP network, respectively. This indicates that DSVM yields a clear improvement in short-term power load forecasting.
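The phase-space reconstruction step can be illustrated with a standard time-delay embedding, sketched below. The load values, the embedding dimension `dim`, and the delay `tau` are arbitrary illustrative choices. In the abstract these two parameters are determined from the data, and this sketch does not reproduce that procedure.

```python
def delay_embed(series, dim, tau):
    """Build the time series matrix whose i-th row is the delay vector
    [x(i), x(i + tau), ..., x(i + (dim - 1) * tau)]."""
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[i + j * tau] for j in range(dim)]
            for i in range(n_vectors)]

# Made-up load readings, used only to show the shape of the result.
load = [10, 12, 15, 14, 13, 16, 18, 17]
X = delay_embed(load, dim=3, tau=2)
```

Each row of `X` can serve as an input vector for a regression model such as an SVM, with the load value that follows the row's last element as the target.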
Funding: National Basic Research Program (973) of China (No. 2004CB719401); National Research Foundation for the Doctoral Program of Higher Education of China (No. 20060003060)
Abstract: In this paper, we present a novel Support Vector Machine active learning algorithm for effective 3D model retrieval using the concept of relevance feedback. The proposed method learns from the most informative objects, which are marked by the user, and then creates a boundary separating the relevant models from the irrelevant ones. It needs only a small number of 3D models labelled by the user, and it can grasp the user's semantic knowledge rapidly and accurately. Experimental results show that the proposed algorithm significantly improves retrieval effectiveness. Compared with four state-of-the-art query refinement schemes for 3D model retrieval, it provides superior retrieval performance after no more than two rounds of relevance feedback.