A total of 200 properties related to structural characteristics were used to represent the structures of 400 HA-coded proteins of influenza virus, which served as training samples. Recognition models for HA proteins of avian influenza virus (AIV) were developed using support vector machine (SVM) and linear discriminant analysis (LDA). With LDA, the identification accuracy (Ria) is 99.8% for the training samples and 99.5% under leave-one-out cross-validation. With SVM, Ria is 99.8% for the training samples and 99.3% under leave-one-out cross-validation. An external set of 200 HA proteins of influenza virus was used to validate the predictive power of the resulting models. The external Ria is 95.5% by LDA and 96.5% by SVM, which shows that HA proteins of AIVs are well recognized by both SVM and LDA, and that SVM outperforms LDA on the external set.
The recognition of pathological voice is considered a difficult task in speech analysis. Moreover, otolaryngologists must rely on oral communication with patients to discover traces of voice pathologies such as dysphonia, which is caused by alteration of the vocal folds, and their accuracy is between 60% and 70%. To improve detection accuracy and reduce the processing time of dysphonia detection, a novel approach is proposed in this paper. We leverage Linear Discriminant Analysis (LDA) to train multiple Machine Learning (ML) models for dysphonia detection. Several ML models, including Support Vector Machine (SVM), Logistic Regression, and K-nearest neighbor (K-NN), are used to predict voice pathologies based on features such as Mel-Frequency Cepstral Coefficients (MFCC), Fundamental Frequency (F0), Shimmer (%), Jitter (%), and Harmonic to Noise Ratio (HNR). The experiments were performed using the Saarbrucken Voice Database (SVD) and a privately collected dataset. K-fold cross-validation was incorporated to increase the robustness and stability of the ML models. According to the experimental results, the proposed approach has a 70% increase in processing speed over Principal Component Analysis (PCA) and achieves a recognition accuracy of 95.24% on the SVD dataset, surpassing the previous best accuracy of 82.37%. On the private dataset, the proposed method achieved an accuracy of 93.37%. It can be an effective non-invasive method to detect dysphonia.
In this paper, we extend our previous study of the important problem of automatically identifying question and non-question segments in Arabic monologues using prosodic features. We propose two novel classification approaches to this problem: one based on type-2 fuzzy logic systems (type-2 FLS) and the other on the discriminative sensitivity-based linear learning method (SBLLM). Prosodic features have been used in a plethora of practical applications, including speech-related applications such as speaker and word recognition, emotion and accent identification, topic and sentence segmentation, and text-to-speech. In this paper, we continue to focus specifically on the Arabic language, as other languages have received a lot of attention in this regard. Moreover, we aim to improve on the performance of our previously used techniques, of which the support vector machine (SVM) method performed best, by applying the two classification approaches mentioned above. The recorded continuous speech is first segmented into sentences using both energy and time duration parameters. The prosodic features are then extracted from each sentence and fed into each of the two proposed classifiers, which classify each sentence as a Question or a Non-Question sentence. Our extensive simulation work, based on a moderately sized database, showed that the two proposed classifiers outperform SVM in all of the experiments carried out, with the type-2 FLS classifier consistently exhibiting the best performance because of its ability to handle all forms of uncertainty.
Abstract: Support vector-based learning methods are an important part of Computational Intelligence techniques. Recent efforts have addressed the problem of learning from very large datasets. This paper reviews the most commonly used formulations of support vector machines for regression (SVRs), with emphasis on their usability in large-scale applications. We review the general concept of support vector machines (SVMs), survey the state of the art in training methods for SVMs, and explain the fundamental principle of SVRs. The most common learning methods for SVRs are introduced, and linear programming-based SVR formulations are explained with emphasis on their suitability for large-scale learning. Finally, the paper discusses some open problems and current trends.
Funding: Project supported by the National Outstanding Youth Science Foundation of China (No. 60025308) and the Teach and Research Award Program for Outstanding Young Teachers in Higher Education Institutions of MOE, China
Abstract: This paper presents two predictive control algorithms, both based on a Least Squares Support Vector Machine (LS-SVM) model, for industrial processes with different degrees of nonlinearity. For a weakly nonlinear system, the system model is built using LS-SVM with a linear kernel function, and the resulting linear LS-SVM model is then transformed into a linear input-output relation of the controlled system. For a strongly nonlinear system, the off-line model of the controlled system is built using LS-SVM with a Radial Basis Function (RBF) kernel. The resulting nonlinear LS-SVM model is linearized at each sampling instant while the system is running, which yields the on-line linear input-output model of the system. In both algorithms, the Generalized Predictive Control (GPC) algorithm is applied to the linear input-output model to implement predictive control of the plant. Simulation results on two different industrial process models demonstrate the effectiveness and merit of both algorithms.
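For orientation, the block below gives the standard textbook form of LS-SVM regression. It is not taken from this abstract, and the symbols are the usual assumed notation: regularization constant $\gamma$, kernel $k$, kernel matrix $K$, dual coefficients $\boldsymbol{\alpha}$, and bias $b$. Training reduces to solving one linear system rather than a quadratic program.

```latex
\begin{aligned}
\begin{bmatrix} 0 & \mathbf{1}^{\top} \\ \mathbf{1} & K + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \boldsymbol{\alpha} \end{bmatrix}
&=
\begin{bmatrix} 0 \\ \mathbf{y} \end{bmatrix},
\qquad K_{ij} = k(x_i, x_j), \\
\hat{y}(x) &= \sum_{i=1}^{N} \alpha_i \, k(x, x_i) + b .
\end{aligned}
```

With a linear kernel $k(x, z) = x^{\top} z$, the model collapses to $\hat{y}(x) = w^{\top} x + b$ with $w = \sum_{i} \alpha_i x_i$, which is one way to read the "linear input-output relation" mentioned for the weakly nonlinear case.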
Abstract: Despite its great efficiency for pattern classification, the proximal support vector machine (PSVM), a recently proposed version of SVM, is sensitive to noise and outliers. To overcome this drawback, this paper modifies PSVM by associating a weight value with each input data point. The weight is calculated from the distance between each data point and the center of its corresponding class. In this way, the effect of noise is reduced. Experiments indicate that the new SVM, the weighted proximal support vector machine (WPSVM), is much more robust to noise than PSVM without losing the computationally attractive features of PSVM.
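The distance-based weighting can be sketched as follows. The abstract does not state the weight formula, so the rule used here, w_i = 1 - d_i / (d_max + delta), is an assumption borrowed from common fuzzy-SVM practice; the function names and the toy data are hypothetical as well.

```python
import math
from collections import defaultdict


def class_centers(points, labels):
    """Return the mean vector (center) of each class."""
    sums = {}
    counts = defaultdict(int)
    for p, y in zip(points, labels):
        if y not in sums:
            sums[y] = [0.0] * len(p)
        sums[y] = [s + v for s, v in zip(sums[y], p)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def distance_weights(points, labels, delta=1e-6):
    """Weight each point by its distance to its own class center.

    ASSUMPTION: w_i = 1 - d_i / (d_max + delta), where d_max is the largest
    distance inside the point's class. The abstract gives no formula.
    """
    centers = class_centers(points, labels)
    dists = [math.dist(p, centers[y]) for p, y in zip(points, labels)]
    d_max = defaultdict(float)
    for d, y in zip(dists, labels):
        d_max[y] = max(d_max[y], d)
    return [1.0 - d / (d_max[y] + delta) for d, y in zip(dists, labels)]


# Toy data: the third point of class +1 lies far from the other two.
points = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0),
          (3.0, 3.0), (3.2, 3.0), (3.1, 3.2)]
labels = [1, 1, 1, -1, -1, -1]
weights = distance_weights(points, labels)
```

In this toy set the far-away point receives a weight close to zero, so it would contribute little to a weighted training objective.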
Abstract: In computer vision, emotion recognition using facial expression images is considered an important research issue. Deep learning advances in recent years have helped improve results on this issue. According to recent studies, facial photographs representing a particular type of emotion may include multiple facial expressions. It is feasible and useful to convert face photos into collections of visual words and carry out global expression recognition. The main contribution of this paper is a facial expression recognition model (FERM) based on an optimized Support Vector Machine (SVM). AffectNet is used to test the performance of the proposed model. AffectNet uses 1250 emotion-related keywords in six different languages to query three major search engines and collect over 1,000,000 facial photos online. The FERM is composed of three main phases: (i) the data preparation phase, (ii) grid search for optimization, and (iii) the categorization phase. Linear discriminant analysis (LDA) is used to categorize the data into eight labels (neutral, happy, sad, surprised, fear, disgust, angry, and contempt). Using LDA noticeably improves the performance of categorization via SVM. Grid search is used to find the optimal values of the SVM hyperparameters (C and gamma). The proposed optimized SVM algorithm achieves an accuracy of 99% and an F1 score of 98%.
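The grid search over C and gamma can be sketched as an exhaustive loop over candidate pairs. This is a minimal illustration, not the paper's implementation: the power-of-two grids are an assumed, commonly used choice, and `toy_score` is a hypothetical stand-in for the cross-validated accuracy that a real SVM run would return.

```python
import itertools
import math


def grid_search(score_fn, c_values, gamma_values):
    """Return (best_score, best_C, best_gamma) over the full grid."""
    best = None
    for c, gamma in itertools.product(c_values, gamma_values):
        score = score_fn(c, gamma)
        if best is None or score > best[0]:
            best = (score, c, gamma)
    return best


# ASSUMPTION: logarithmic (power-of-two) grids for both hyperparameters.
c_grid = [2.0 ** k for k in range(-5, 16, 2)]
gamma_grid = [2.0 ** k for k in range(-15, 4, 2)]


def toy_score(c, gamma):
    """HYPOTHETICAL score surface that peaks at C = 8, gamma = 0.125."""
    return (1.0
            - 0.01 * (math.log2(c) - 3) ** 2
            - 0.01 * (math.log2(gamma) + 3) ** 2)


best_score, best_c, best_gamma = grid_search(toy_score, c_grid, gamma_grid)
```

In practice `score_fn` would train an SVM with the given pair and return a validation metric, so the cost of the search grows with the product of the two grid sizes.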
Funding: Nantong Research Program of Application Foundation, China (No. BK2012030); Key Project of Science and Technology Commission of Shanghai Municipality, China (No. 10JC1405000)
Abstract: Temperature prediction plays an important role in ring die granulator control, which can influence the quantity and quality of production. Temperature prediction modeling is a complicated problem because of its MIMO, nonlinear, and large time-delay characteristics. Support vector machines (SVMs) have been applied successfully to small data sets. However, their accuracy is not high, and if the number of samples and the feature dimension increase, the training time of the model increases dramatically. In this paper, a linear SVM combined with cyclic coordinate descent (CCD) was applied to big-data regression. The approach was rigorously proved mathematically and validated by simulation. Real data were also used to demonstrate the effectiveness of the linear SVM model. Compared in simulation with other methods for big data, this algorithm shows clear advantages in both modeling speed and goodness of fit.
Funding: the National Natural Science Foundation of China (70471074); China Postdoctoral Science Foundation (2005038042); Department of Science and Technology of Guangdong Province (2004B36001051).
Abstract: To address the problems SVM has with large sample sizes and asymmetrically distributed samples, a support vector classification algorithm based on variable-parameter linear programming is proposed. In the proposed algorithm, linear programming is used to solve the classification optimization problem, which decreases computation time and reduces complexity compared with the original model. The adjusted penalty parameter greatly reduces the classification error caused by asymmetrically distributed samples. The detailed procedure of the proposed algorithm is given, and an experiment is conducted to verify whether the algorithm is suitable for asymmetrically distributed samples.
Funding: This work was partly supported by the Technology development Program of MSS [No. S3033853] and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2020R1I1A3069700).
Abstract: In this research work, we propose a medical image analysis framework with two separate outputs: whether or not Synovial Sarcoma (SS) is the cancer cell structure. Within this framework, the histopathology images are decomposed into third-level sub-bands using a two-dimensional Discrete Wavelet Transform. Subsequently, structure features (SFs) based on Principal Components Analysis (PCA), Independent Components Analysis (ICA), and Linear Discriminant Analysis (LDA) are extracted from this sub-band image representation using the distribution of wavelet coefficients. These SFs are used as inputs to the Support Vector Machine (SVM) classifier. The classification efficiency of PCA+SVM, ICA+SVM, and LDA+SVM with a Radial Basis Function (RBF) kernel is compared to identify the best classification results. Furthermore, data collected from various histopathological centres via the Internet of Things (IoT) are stored and shared using blockchain technology, which allows wide image distribution across secure IoT devices. For this reason, the minimum and maximum values of the kernel parameter are adjusted and updated periodically for industrial application in device calibration. The work thus provides an example of a technique for training and testing cancer cell structure prognosis methods on spindle shaped cell (SSC) histopathological imaging databases. The cross-validation performance is evaluated with the receiver operating characteristics (ROC) curve, and significant differences in classification performance between the techniques are analyzed. The LDA+SVM combination proved effective for intelligent SS cancer detection, offering excellent classification accuracy, sensitivity, and specificity.
Funding: Supported by the Ministry of Environmental Protection of China (No. 2011467037)
Abstract: In this paper, a quantitative structure-activity relationship (QSAR) study was performed to predict the acute toxicity of aromatic amines. A set of 56 compounds was randomly divided into a training set of 46 compounds and a test set of 10 compounds. Electronic and topological descriptors computed by the Scigress package and Dragon software were used as predictor variables. Multiple linear regression (MLR) and support vector machine (SVM) were used to build the linear and nonlinear QSAR models, respectively. The resulting models, each with five descriptors, show strong predictive ability. The linear model fits the training set with R² = 0.71, while the SVM model achieves a higher R² = 0.77. The validation results on the test set indicate that the SVM model is comparable or superior to the MLR model in terms of both prediction ability and robustness.
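The random 46/10 split and the R² figure can be illustrated with a short sketch. The abstract does not say which definition of R² was used, so the coefficient of determination (1 - SS_res / SS_tot) is assumed here; the seed and the toy values are hypothetical and unrelated to the paper's data.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Random split of 56 compound indices into 46 training and 10 test items.
rng = random.Random(0)  # fixed seed (hypothetical) for reproducibility
indices = list(range(56))
rng.shuffle(indices)
train_idx, test_idx = indices[:46], indices[46:]

# Toy observed and predicted values, for illustration only.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_fit = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(y_obs, y_fit)
```

With only 10 test compounds, a single split gives a noisy estimate, which is one reason such studies often report training-set and test-set statistics side by side.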
Funding: supported by the National Basic Research Program (973) of China (No. 2009CB724006) and the National Natural Science Foundation of China (No. 60977010)
Abstract: Selecting the optimal parameters for the support vector machine (SVM) has long been a hot research topic. For support vector classification/regression (SVC/SVR) with the radial basis function (RBF) kernel, we summarize the rough line rule relating the penalty parameter and the kernel width, and propose a novel linear search method to obtain these two optimal parameters. We use a direct-setting method with thresholds to set the epsilon parameter of SVR. The proposed method directly locates the right search field, which greatly reduces computing time and achieves stable, high accuracy. The method is competitive for both SVC and SVR. It is easy to use and can be applied to a new data set without any adjustment, since it has no parameters of its own to set.
Funding: This work was financially supported by the National Natural Science Foundation of China (41972262); Hebei Natural Science Foundation for Excellent Young Scholars (D2020504032); Central Plains Science and technology innovation leader Project (214200510030); Key research and development Project of Henan province (221111321500).
Abstract: Landslides are a serious natural disaster, second only to earthquakes and floods, and pose a great threat to people's lives and property. Traditional landslide research relies on experience-driven or statistical models, and its assessment results are subjective, difficult to quantify, and not targeted. As a new research method for landslide susceptibility assessment, machine learning can greatly improve the accuracy of landslide susceptibility models by constructing statistical models. Taking western Henan as an example, the study selected 16 landslide influencing factors covering topography, geological environment, hydrological conditions, and human activities, and then used the recursive feature elimination (RFE) method to select the 11 factors with the most significant influence on landslides. Five machine learning methods [Support Vector Machines (SVM), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Linear Discriminant Analysis (LDA)] were used to construct spatial distribution models of landslide susceptibility. The models were evaluated using the receiver operating characteristic (ROC) curve and statistical indices. After analysis and comparison, the XGBoost model (AUC 0.8759) performed best and was suitable for dealing with regression problems. The model adapted well to the landslide data. The landslide susceptibility maps of the five models show the overall distribution: the extremely high and high susceptibility areas are located in the Funiu Mountain range in the southwest, the Xiaoshan Mountain range in the west, and the Yellow River Basin in the north. These areas have large terrain fluctuations, complicated geological structures, and frequent human engineering activities. The extremely high and high susceptibility areas cover 12043.3 km² and 3087.45 km², accounting for 47.61% and 12.20% of the total study area, respectively. The study reflects the distribution of landslide susceptibility in western Henan Province and provides a scientific basis for regional disaster warning, prediction, and resource protection. It has important practical significance for subsequent landslide disaster management.
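The ROC-based evaluation mentioned in the abstract above reports a single AUC value per model. The abstract does not say how the authors computed it, so the following is only a minimal sketch of one standard way to obtain AUC, using the pairwise (Mann-Whitney) identity: AUC is the fraction of positive/negative pairs in which the positive sample receives the higher score. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: sequence of 0/1 values (1 = landslide sample, by assumption).
    scores: predicted susceptibility for each sample, higher = more prone.
    A tie between a positive and a negative counts as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 pairs are ordered correctly
```

The double loop is quadratic in the number of samples, which is acceptable for a sketch; a rank-based implementation would be preferable for large raster datasets.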
Abstract: Interior Alaska has a short growing season of 110 days. Knowing when crops flower and mature provides information for agricultural decision making. In this study, six machine learning algorithms, namely Linear Discriminant Analysis (LDA), Support Vector Machines (SVMs), k-nearest neighbor (kNN), Naïve Bayes (NB), Recursive Partitioning and Regression Trees (RPART), and Random Forest (RF), were selected to forecast the timing of barley flowering and maturity based on the Alaska Crop Datasets and climate data from 1991 to 2016 in Fairbanks, Alaska. Among the 32 models fit to forecast flowering time, two from LDA, 12 from SVMs, four from NB, and three from RF outperformed the models from the other algorithms, achieving the highest accuracy. Models from kNN performed worst in forecasting flowering time. Among the 32 models fit to forecast maturity time, two models from LDA outperformed the models from the other algorithms. Models from kNN and RPART performed worst in forecasting maturity time. The machine learning models also provided a variable importance explanation. In this study, four of the six algorithms gave the same variable importance order. Sowing date was the most important variable for forecasting flowering but a less important variable for forecasting maturity. Daily maximum temperature may be more important than daily minimum temperature for fitting flowering models, while daily minimum temperature may be more important than daily maximum temperature for fitting maturity models. The results indicate that machine learning models are a promising technique for forecasting the timing of barley flowering and maturity.
Funding: Foundations of National High Technology (863) Programme (Grant No. 2006AA02Z312); Innovative Group Programme for Graduates of Chongqing University, Science and Innovation Fund (Grant No. 200711C1A0010260); National 111 Programme Introducing Talents of Discipline to Universities (Grant No. 0507111106); Chongqing Municipality Basic and Applied Fundamental Science Fund (Grant No. 01-3-6); National Chunhui Project Foundation (Grant No. 99-4-4+3-7); State Key Laboratory of Chemo/Biosensing and Chemometrics Fund (Grant No. 2005012); Fok-Yingtung Educational Foundation (Grant No. 98-7-6).
Abstract: A total of 200 properties related to structural characteristics were used to represent the structures of 400 HA coded proteins of influenza virus as training samples. Recognition models for HA proteins of avian influenza virus (AIV) were developed using support vector machine (SVM) and linear discriminant analysis (LDA). With LDA, the identification accuracy (Ria) is 99.8% for the training samples and 99.5% by leave-one-out cross-validation. With the SVM model, Ria is 99.8% for the training samples and 99.3% by leave-one-out cross-validation. An external set of 200 HA proteins of influenza virus was used to validate the external predictive power of the resulting models. The external Ria is 95.5% by LDA and 96.5% by SVM, which shows that HA proteins of AIVs are well recognized by both SVM and LDA, and that the performance of SVM is superior to that of LDA.
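The leave-one-out cross-validation used in the abstract above holds out each sample once, trains on the remaining samples, and reports the fraction of held-out samples that are classified correctly. The sketch below illustrates only that procedure. It substitutes a simple nearest-centroid classifier for the SVM and LDA models of the study, and the function names, feature vectors, and labels are all hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (not SVM or LDA): assign x to the closest class mean."""
    sums, counts = {}, {}
    for xi, yi in zip(train_X, train_y):
        if yi not in sums:
            sums[yi] = [0.0] * len(xi)
            counts[yi] = 0
        sums[yi] = [a + b for a, b in zip(sums[yi], xi)]
        counts[yi] += 1

    best_label, best_dist = None, float("inf")
    for label, total in sums.items():
        centroid = [v / counts[label] for v in total]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(X, y, predict):
    """Hold out each sample once, train on the rest, return the hit rate."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Invented two-feature toy data; the study used 200 structural properties.
toy_X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
toy_y = ["other", "other", "other", "AIV", "AIV", "AIV"]
loo_accuracy = leave_one_out_accuracy(toy_X, toy_y, nearest_centroid_predict)
```

Passing the classifier in as the `predict` argument keeps the validation loop independent of the model, so the same loop could wrap any classifier with the same call shape.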
Abstract: The recognition of pathological voice is considered a difficult task in speech analysis. Moreover, otolaryngologists have had to rely on oral communication with patients to discover traces of voice pathologies such as dysphonia, which is caused by alteration of the vocal folds, and their accuracy is between 60% and 70%. To enhance the detection accuracy and reduce the processing time of dysphonia detection, a novel approach is proposed in this paper. We leverage Linear Discriminant Analysis (LDA) to train multiple Machine Learning (ML) models for dysphonia detection. Several ML models, such as Support Vector Machine (SVM), Logistic Regression, and K-nearest neighbor (K-NN), are used to predict voice pathologies based on features such as Mel-Frequency Cepstral Coefficients (MFCC), Fundamental Frequency (F0), Shimmer (%), Jitter (%), and Harmonic to Noise Ratio (HNR). The experiments were performed using the Saarbrucken Voice Database (SVD) and a privately collected dataset. K-fold cross-validation was used to increase the robustness and stability of the ML models. According to the experimental results, the proposed approach achieves a 70% increase in processing speed over Principal Component Analysis (PCA) and a recognition accuracy of 95.24% on the SVD dataset, surpassing the previous best accuracy of 82.37%. On the private dataset, the proposed method achieved an accuracy of 93.37%. It can be an effective non-invasive method for detecting dysphonia.
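The K-fold cross-validation mentioned in the abstract above partitions the samples into K folds and uses each fold once as the test set while training on the others. The abstract gives neither K nor the splitting details, so the following is only a minimal sketch of the index bookkeeping. The function name `k_fold_indices` and the sample count are assumptions, and the sketch omits the shuffling and class stratification that are usually applied in practice.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds.

    Returns a list of (train_indices, test_indices) pairs. The first
    n_samples % k folds get one extra sample so that sizes differ by at most 1.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")

    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


# Ten hypothetical recordings split into three folds of sizes 4, 3, and 3.
toy_folds = k_fold_indices(10, 3)
```

Each model would be trained on `train_idx` and scored on `test_idx`, and the K scores averaged to give the reported accuracy.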
Abstract: In this paper, we extend our previous study of the important problem of automatically identifying question and non-question segments in Arabic monologues using prosodic features. We propose two novel classification approaches to this problem: one based on type-2 fuzzy logic systems (type-2 FLS) and the other on the discriminative sensitivity-based linear learning method (SBLLM). Prosodic features have been used in a plethora of practical speech-related applications, such as speaker and word recognition, emotion and accent identification, topic and sentence segmentation, and text-to-speech. In this paper, we continue to focus specifically on the Arabic language, as other languages have received a lot of attention in this regard. Moreover, we aim to improve on the performance of our previously used techniques, of which the support vector machine (SVM) method performed best, by applying the two classification approaches mentioned above. The recorded continuous speech is first segmented into sentences using both energy and time duration parameters. The prosodic features are then extracted from each sentence and fed into each of the two proposed classifiers to classify each sentence as a Question or a Non-Question sentence. Our extensive simulation work, based on a moderately sized database, showed that the two proposed classifiers outperform SVM in all of the experiments carried out, with the type-2 FLS classifier consistently exhibiting the best performance because of its ability to handle all forms of uncertainty.
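The abstract above states that continuous speech is segmented into sentences using energy and time duration parameters, but it does not give the rule. One common way to combine the two is sketched below: a segment is closed once the frame energy stays below a threshold for a minimum pause length, and segments shorter than a minimum duration are discarded. The function name, the thresholds, and the toy energy values are all assumptions and should not be read as the authors' actual procedure.

```python
def segment_by_energy(frame_energy, energy_threshold, min_pause_frames, min_segment_frames):
    """Return (start, end) frame index pairs, end exclusive, of speech segments.

    A segment is closed once min_pause_frames consecutive frames fall below
    energy_threshold. Segments shorter than min_segment_frames are dropped.
    """
    segments = []
    start = None   # index of the first frame of the open segment, if any
    silence = 0    # length of the current run of low-energy frames

    for i, energy in enumerate(frame_energy):
        if energy >= energy_threshold:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_pause_frames:
                end = i - silence + 1  # first frame of the pause
                if end - start >= min_segment_frames:
                    segments.append((start, end))
                start = None
                silence = 0

    # Close a segment that is still open when the recording ends.
    if start is not None:
        end = len(frame_energy) - silence
        if end - start >= min_segment_frames:
            segments.append((start, end))
    return segments


# Invented per-frame energies: two sentences and one short burst that is dropped.
toy_energy = [0, 0, 5, 6, 7, 0, 6, 5, 0, 0, 0, 4, 0, 0, 0, 8, 9, 9, 0]
toy_segments = segment_by_energy(toy_energy, energy_threshold=1,
                                 min_pause_frames=3, min_segment_frames=2)
```

In this toy run the single low-energy frame at index 5 is shorter than the minimum pause, so it does not split the first segment, while the one-frame burst at index 11 is discarded by the duration rule.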