With the widespread data collection and processing,privacy-preserving machine learning has become increasingly important in addressing privacy risks related to individuals.Support vector machine(SVM)is one of the most...With the widespread data collection and processing,privacy-preserving machine learning has become increasingly important in addressing privacy risks related to individuals.Support vector machine(SVM)is one of the most elementary learning models of machine learning.Privacy issues surrounding SVM classifier training have attracted increasing attention.In this paper,we investigate Differential Privacy-compliant Federated Machine Learning with Dimensionality Reduction,called FedDPDR-DPML,which greatly improves data utility while providing strong privacy guarantees.Considering in distributed learning scenarios,multiple participants usually hold unbalanced or small amounts of data.Therefore,FedDPDR-DPML enables multiple participants to collaboratively learn a global model based on weighted model averaging and knowledge aggregation and then the server distributes the global model to each participant to improve local data utility.Aiming at high-dimensional data,we adopt differential privacy in both the principal component analysis(PCA)-based dimensionality reduction phase and SVM classifiers training phase,which improves model accuracy while achieving strict differential privacy protection.Besides,we train Differential privacy(DP)-compliant SVM classifiers by adding noise to the objective function itself,thus leading to better data utility.Extensive experiments on three high-dimensional datasets demonstrate that FedDPDR-DPML can achieve high accuracy while ensuring strong privacy protection.展开更多
The selection of important factors in machine learning-based susceptibility assessments is crucial to obtain reliable susceptibility results.In this study,metaheuristic optimization and feature selection techniques we...The selection of important factors in machine learning-based susceptibility assessments is crucial to obtain reliable susceptibility results.In this study,metaheuristic optimization and feature selection techniques were applied to identify the most important input parameters for mapping debris flow susceptibility in the southern mountain area of Chengde City in Hebei Province,China,by using machine learning algorithms.In total,133 historical debris flow records and 16 related factors were selected.The support vector machine(SVM)was first used as the base classifier,and then a hybrid model was introduced by a two-step process.First,the particle swarm optimization(PSO)algorithm was employed to select the SVM model hyperparameters.Second,two feature selection algorithms,namely principal component analysis(PCA)and PSO,were integrated into the PSO-based SVM model,which generated the PCA-PSO-SVM and FS-PSO-SVM models,respectively.Three statistical metrics(accuracy,recall,and specificity)and the area under the receiver operating characteristic curve(AUC)were employed to evaluate and validate the performance of the models.The results indicated that the feature selection-based models exhibited the best performance,followed by the PSO-based SVM and SVM models.Moreover,the performance of the FS-PSO-SVM model was better than that of the PCA-PSO-SVM model,showing the highest AUC,accuracy,recall,and specificity values in both the training and testing processes.It was found that the selection of optimal features is crucial to improving the reliability of debris flow susceptibility assessment results.Moreover,the PSO algorithm was found to be not only an effective tool for hyperparameter optimization,but also a useful feature selection algorithm to improve 
prediction accuracies of debris flow susceptibility by using machine learning algorithms.The high and very high debris flow susceptibility zone appropriately covers 38.01%of the study area,where debris flow may occur under intensive human activities and heavy rainfall events.展开更多
The distribution of data has a significant impact on the results of classification.When the distribution of one class is insignificant compared to the distribution of another class,data imbalance occurs.This will resu...The distribution of data has a significant impact on the results of classification.When the distribution of one class is insignificant compared to the distribution of another class,data imbalance occurs.This will result in rising outlier values and noise.Therefore,the speed and performance of classification could be greatly affected.Given the above problems,this paper starts with the motivation and mathematical representing of classification,puts forward a new classification method based on the relationship between different classification formulations.Combined with the vector characteristics of the actual problem and the choice of matrix characteristics,we firstly analyze the orderly regression to introduce slack variables to solve the constraint problem of the lone point.Then we introduce the fuzzy factors to solve the problem of the gap between the isolated points on the basis of the support vector machine.We introduce the cost control to solve the problem of sample skew.Finally,based on the bi-boundary support vector machine,a twostep weight setting twin classifier is constructed.This can help to identify multitasks with feature-selected patterns without the need for additional optimizers,which solves the problem of large-scale classification that can’t deal effectively with the very low category distribution gap.展开更多
Algorithms for steganography are methods of hiding data transfers in media files.Several machine learning architectures have been presented recently to improve stego image identification performance by using spatial i...Algorithms for steganography are methods of hiding data transfers in media files.Several machine learning architectures have been presented recently to improve stego image identification performance by using spatial information,and these methods have made it feasible to handle a wide range of problems associated with image analysis.Images with little information or low payload are used by information embedding methods,but the goal of all contemporary research is to employ high-payload images for classification.To address the need for both low-and high-payload images,this work provides a machine-learning approach to steganography image classification that uses Curvelet transformation to efficiently extract characteristics from both type of images.Support Vector Machine(SVM),a commonplace classification technique,has been employed to determine whether the image is a stego or cover.The Wavelet Obtained Weights(WOW),Spatial Universal Wavelet Relative Distortion(S-UNIWARD),Highly Undetectable Steganography(HUGO),and Minimizing the Power of Optimal Detector(MiPOD)steganography techniques are used in a variety of experimental scenarios to evaluate the performance of the proposedmethod.Using WOW at several payloads,the proposed approach proves its classification accuracy of 98.60%.It exhibits its superiority over SOTA methods.展开更多
AIM:To develop a classifier for traditional Chinese medicine(TCM)syndrome differentiation of diabetic retinopathy(DR),using optimized machine learning algorithms,which can provide the basis for TCM objective and intel...AIM:To develop a classifier for traditional Chinese medicine(TCM)syndrome differentiation of diabetic retinopathy(DR),using optimized machine learning algorithms,which can provide the basis for TCM objective and intelligent syndrome differentiation.METHODS:Collated data on real-world DR cases were collected.A variety of machine learning methods were used to construct TCM syndrome classification model,and the best performance was selected as the basic model.Genetic Algorithm(GA)was used for feature selection to obtain the optimal feature combination.Harris Hawk Optimization(HHO)was used for parameter optimization,and a classification model based on feature selection and parameter optimization was constructed.The performance of the model was compared with other optimization algorithms.The models were evaluated with accuracy,precision,recall,and F1 score as indicators.RESULTS:Data on 970 cases that met screening requirements were collected.Support Vector Machine(SVM)was the best basic classification model.The accuracy rate of the model was 82.05%,the precision rate was 82.34%,the recall rate was 81.81%,and the F1 value was 81.76%.After GA screening,the optimal feature combination contained 37 feature values,which was consistent with TCM clinical practice.The model based on optimal combination and SVM(GA_SVM)had an accuracy improvement of 1.92%compared to the basic classifier.SVM model based on HHO and GA optimization(HHO_GA_SVM)had the best performance and convergence speed compared with other optimization algorithms.Compared with the basic classification model,the accuracy was improved by 3.51%.CONCLUSION:HHO and GA optimization can improve the model performance of SVM in TCM syndrome differentiation of DR.It provides a new method and research idea 
for TCM intelligent assisted syndrome differentiation.展开更多
The turbidite channel of South China Sea has been highly concerned.Influenced by the complex fault and the rapid phase change of lithofacies,predicting the channel through conventional seismic attributes is not accura...The turbidite channel of South China Sea has been highly concerned.Influenced by the complex fault and the rapid phase change of lithofacies,predicting the channel through conventional seismic attributes is not accurate enough.In response to this disadvantage,this study used a method combining grey relational analysis(GRA)and support vectormachine(SVM)and established a set of prediction technical procedures suitable for reservoirs with complex geological conditions.In the case study of the Huangliu Formation in Qiongdongnan Basin,South China Sea,this study first dimensionalized the conventional seismic attributes of Gas Layer Group I and then used the GRA method to obtain the main relational factors.A higher relational degree indicates a higher probability of responding to the attributes of the turbidite channel.This study then accumulated the optimized attributes with the highest relational factors to obtain a first-order accumulated sequence,which was used as the input training sample of the SVM model,thus successfully constructing the SVM turbidite channel model.Drilling results prove that the GRA-SVMmethod has a high drilling coincidence rate.Utilizing the core and logging data and taking full use of the advantages of seismic inversion in predicting the sand boundary of water channels,this study divides the sedimentary microfacies of the Huangliu Formation in the Lingshui 17-2 Gas Field.This comprehensive study has shown that the GRA-SVM method has high accuracy for predicting turbidite channels and can be used as a superior turbidite channel prediction method under complex geological conditions.展开更多
This article delves into the analysis of performance and utilization of Support Vector Machines (SVMs) for the critical task of forest fire detection using image datasets. With the increasing threat of forest fires to...This article delves into the analysis of performance and utilization of Support Vector Machines (SVMs) for the critical task of forest fire detection using image datasets. With the increasing threat of forest fires to ecosystems and human settlements, the need for rapid and accurate detection systems is of utmost importance. SVMs, renowned for their strong classification capabilities, exhibit proficiency in recognizing patterns associated with fire within images. By training on labeled data, SVMs acquire the ability to identify distinctive attributes associated with fire, such as flames, smoke, or alterations in the visual characteristics of the forest area. The document thoroughly examines the use of SVMs, covering crucial elements like data preprocessing, feature extraction, and model training. It rigorously evaluates parameters such as accuracy, efficiency, and practical applicability. The knowledge gained from this study aids in the development of efficient forest fire detection systems, enabling prompt responses and improving disaster management. Moreover, the correlation between SVM accuracy and the difficulties presented by high-dimensional datasets is carefully investigated, demonstrated through a revealing case study. The relationship between accuracy scores and the different resolutions used for resizing the training datasets has also been discussed in this article. These comprehensive studies result in a definitive overview of the difficulties faced and the potential sectors requiring further improvement and focus.展开更多
One-class classification problem has become a popular problem in many fields, with a wide range of applications in anomaly detection, fault diagnosis, and face recognition. We investigate the one-class classification ...One-class classification problem has become a popular problem in many fields, with a wide range of applications in anomaly detection, fault diagnosis, and face recognition. We investigate the one-class classification problem for second-order tensor data. Traditional vector-based one-class classification methods such as one-class support vector machine (OCSVM) and least squares one-class support vector machine (LSOCSVM) have limitations when tensor is used as input data, so we propose a new tensor one-class classification method, LSOCSTM, which directly uses tensor as input data. On one hand, using tensor as input data not only enables to classify tensor data, but also for vector data, classifying it after high dimensionalizing it into tensor still improves the classification accuracy and overcomes the over-fitting problem. On the other hand, different from one-class support tensor machine (OCSTM), we use squared loss instead of the original loss function so that we solve a series of linear equations instead of quadratic programming problems. Therefore, we use the distance to the hyperplane as a metric for classification, and the proposed method is more accurate and faster compared to existing methods. The experimental results show the high efficiency of the proposed method compared with several state-of-the-art methods.展开更多
A prediction control algorithm is presented based on least squares support vector machines (LS-SVM) model for a class of complex systems with strong nonlinearity. The nonlinear off-line model of the controlled plant i...A prediction control algorithm is presented based on least squares support vector machines (LS-SVM) model for a class of complex systems with strong nonlinearity. The nonlinear off-line model of the controlled plant is built by LS-SVM with radial basis function (RBF) kernel. In the process of system running, the off-line model is linearized at each sampling instant, and the generalized prediction control (GPC) algorithm is employed to implement the prediction control for the controlled plant. The obtained algorithm is applied to a boiler temperature control system with complicated nonlinearity and large time delay. The results of the experiment verify the effectiveness and merit of the algorithm.展开更多
In this paper, we propose a multidimensional version of recurrent least squares support vector machines (MDRLS- SVM) to solve the problem about the prediction of chaotic system. To acquire better prediction performa...In this paper, we propose a multidimensional version of recurrent least squares support vector machines (MDRLS- SVM) to solve the problem about the prediction of chaotic system. To acquire better prediction performance, the high-dimensional space, which provides more information on the system than the scalar time series, is first reconstructed utilizing Takens's embedding theorem. Then the MDRLS-SVM instead of traditional RLS-SVM is used in the high- dimensional space, and the prediction performance can be improved from the point of view of reconstructed embedding phase space. In addition, the MDRLS-SVM algorithm is analysed in the context of noise, and we also find that the MDRLS-SVM has lower sensitivity to noise than the RLS-SVM.展开更多
A sparse approximation algorithm based on projection is presented in this paper in order to overcome the limitation of the non-sparsity of least squares support vector machines (LS-SVM). The new inputs are projected...A sparse approximation algorithm based on projection is presented in this paper in order to overcome the limitation of the non-sparsity of least squares support vector machines (LS-SVM). The new inputs are projected into the subspace spanned by previous basis vectors (BV) and those inputs whose squared distance from the subspace is higher than a threshold are added in the BV set, while others are rejected. This consequently results in the sparse approximation. In addition, a recursive approach to deleting an exiting vector in the BV set is proposed. Then the online LS-SVM, sparse approximation and BV removal are combined to produce the sparse online LS-SVM algorithm that can control the size of memory irrespective of the processed data size. The suggested algorithm is applied in the online modeling of a pH neutralizing process and the isomerization plant of a refinery, respectively. The detailed comparison of computing time and precision is also given between the suggested algorithm and the nonsparse one. The results show that the proposed algorithm greatly improves the sparsity just with little cost of precision.展开更多
Used for industrial process with different degree of nonlinearity, the two predictive control algorithms presented in this paper are based on Least Squares Support Vector Machines (LS-SVM) model. For the weakly nonlin...Used for industrial process with different degree of nonlinearity, the two predictive control algorithms presented in this paper are based on Least Squares Support Vector Machines (LS-SVM) model. For the weakly nonlinear system, the system model is built by using LS-SVM with linear kernel function, and then the obtained linear LS-SVM model is transformed into linear input-output relation of the controlled system. However, for the strongly nonlinear system, the off-line model of the controlled system is built by using LS-SVM with Radial Basis Function (RBF) kernel. The obtained nonlinear LS-SVM model is linearized at each sampling instant of system running, after which the on-line linear input-output model of the system is built. Based on the obtained linear input-output model, the Generalized Predictive Control (GPC) algorithm is employed to implement predictive control for the controlled plant in both algorithms. The simulation results after the presented algorithms were implemented in two different industrial processes model; respectively revealed the effectiveness and merit of both algorithms.展开更多
This study explores the least square support vector and wavelet technique (WLSSVM) in the monthly stream flow forecasting. This is a new hybrid technique. The 30 days periodic predicting statistics used in this study ...This study explores the least square support vector and wavelet technique (WLSSVM) in the monthly stream flow forecasting. This is a new hybrid technique. The 30 days periodic predicting statistics used in this study are derived from the subjection of this model to the river flow data of the Jhelum and Chenab rivers. The root mean square error (RMSE), mean absolute error (RME) and correlation (R) statistics are used for evaluating the accuracy of the WLSSVM and WR models. The accuracy of the WLSSVM model is compared with LSSVM, WR and LR models. The two rivers surveyed are in the Republic of Pakistan and cover an area encompassing 39,200 km2 for the Jhelum River and 67,515 km2 for the Chenab River. Using discrete wavelets, the observed data has been decomposed into sub-series. These have then appropriately been used as inputs in the least square support vector machines for forecasting the hydrological variables. The resultant observation from this comparison indicates the WLSSVM is more accurate than the LSSVM, WR and LR models in river flow forecasting.展开更多
A multi-layer adaptive optimizing parameters algorithm is developed forimproving least squares support vector machines (LS-SVM) , and a military aircraft life-cycle-cost(LCC) intelligent estimation model is proposed b...A multi-layer adaptive optimizing parameters algorithm is developed forimproving least squares support vector machines (LS-SVM) , and a military aircraft life-cycle-cost(LCC) intelligent estimation model is proposed based on the improved LS-SVM. The intelligent costestimation process is divided into three steps in the model. In the first step, a cost-drive-factorneeds to be selected, which is significant for cost estimation. In the second step, militaryaircraft training samples within costs and cost-drive-factor set are obtained by the LS-SVM. Thenthe model can be used for new type aircraft cost estimation. Chinese military aircraft costs areestimated in the paper. The results show that the estimated costs by the new model are closer to thetrue costs than that of the traditionally used methods.展开更多
The Least Squares Support Vector Machines (LS-SVM) is an improvement to the SVM. Combined the LS-SVM with the Multi-Resolution Analysis (MRA),this letter proposes the Multi-resolution LS-SVM (MLS-SVM).The proposed alg...The Least Squares Support Vector Machines (LS-SVM) is an improvement to the SVM. Combined the LS-SVM with the Multi-Resolution Analysis (MRA),this letter proposes the Multi-resolution LS-SVM (MLS-SVM).The proposed algorithm has the same theoretical framework as MRA but with better approximation ability.At a fixed scale MLS-SVM is a classical LS-SVM,but MLS-SVM can gradually approximate the target function at different scales.In experiments,the MLS-SVM is used for nonlinear system identification,and achieves better identification accuracy.展开更多
A new strategy for noise reduction of fast fading channel is presented. Firstly, more information is acquired utilizing the reconstructed embedding phase space. Then, based on the Recurrent Least Squares Sup-port Vect...A new strategy for noise reduction of fast fading channel is presented. Firstly, more information is acquired utilizing the reconstructed embedding phase space. Then, based on the Recurrent Least Squares Sup-port Vector Machines (RLS-SVM), noise reduction of the fast fading channel is realized. This filtering tech-nique does not make use of the spectral contents of the signal. Based on the stability and the fractal of the cha-otic attractor, the RLS-SVM algorithm is a better candidate for the nonlinear time series noise-reduction. The simulation results shows that better noise-reduction performance is acquired when the signal to noise ratio is 12dB.展开更多
Hard rock pillar is one of the important structures in engineering design and excavation in underground mines.Accurate and convenient prediction of pillar stability is of great significance for underground space safet...Hard rock pillar is one of the important structures in engineering design and excavation in underground mines.Accurate and convenient prediction of pillar stability is of great significance for underground space safety.This paper aims to develop hybrid support vector machine(SVM)models improved by three metaheuristic algorithms known as grey wolf optimizer(GWO),whale optimization algorithm(WOA)and sparrow search algorithm(SSA)for predicting the hard rock pillar stability.An integrated dataset containing 306 hard rock pillars was established to generate hybrid SVM models.Five parameters including pillar height,pillar width,ratio of pillar width to height,uniaxial compressive strength and pillar stress were set as input parameters.Two global indices,three local indices and the receiver operating characteristic(ROC)curve with the area under the ROC curve(AUC)were utilized to evaluate all hybrid models’performance.The results confirmed that the SSA-SVM model is the best prediction model with the highest values of all global indices and local indices.Nevertheless,the performance of the SSASVM model for predicting the unstable pillar(AUC:0.899)is not as good as those for stable(AUC:0.975)and failed pillars(AUC:0.990).To verify the effectiveness of the proposed models,5 field cases were investigated in a metal mine and other 5 cases were collected from several published works.The validation results indicated that the SSA-SVM model obtained a considerable accuracy,which means that the combination of SVM and metaheuristic algorithms is a feasible approach to predict the pillar stability.展开更多
Lung cancer is the most dangerous and death-causing disease indicated by the presence of pulmonary nodules in the lung.It is mostly caused by the instinctive growth of cells in the lung.Lung nodule detection has a sig...Lung cancer is the most dangerous and death-causing disease indicated by the presence of pulmonary nodules in the lung.It is mostly caused by the instinctive growth of cells in the lung.Lung nodule detection has a significant role in detecting and screening lung cancer in Computed tomography(CT)scan images.Early detection plays an important role in the survival rate and treatment of lung cancer patients.Moreover,pulmonary nodule classification techniques based on the convolutional neural network can be used for the accurate and efficient detection of lung cancer.This work proposed an automatic nodule detection method in CT images based on modified AlexNet architecture and Support vector machine(SVM)algorithm namely LungNet-SVM.The proposed model consists of seven convolutional layers,three pooling layers,and two fully connected layers used to extract features.Support vector machine classifier is applied for the binary classification of nodules into benign andmalignant.The experimental analysis is performed by using the publicly available benchmark dataset Lung nodule analysis 2016(LUNA16).The proposed model has achieved 97.64%of accuracy,96.37%of sensitivity,and 99.08%of specificity.A comparative analysis has been carried out between the proposed LungNet-SVM model and existing stateof-the-art approaches for the classification of lung cancer.The experimental results indicate that the proposed LungNet-SVM model achieved remarkable performance on a LUNA16 dataset in terms of accuracy.展开更多
In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables and small number of samples as well as its non-linearity. It is difficult to get satisfying result...In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables and small number of samples as well as its non-linearity. It is difficult to get satisfying results by using conventional linear sta- tistical methods. Recursive feature elimination based on support vector machine (SVM RFE) is an effective algorithm for gene selection and cancer classification, which are integrated into a consistent framework. In this paper, we propose a new method to select parameters of the aforementioned algorithm implemented with Gaussian kernel SVMs as better alternatives to the common practice of selecting the apparently best parameters by using a genetic algorithm to search for a couple of optimal parameter. Fast implementation issues for this method are also discussed for pragmatic reasons. The proposed method was tested on two repre- sentative hereditary breast cancer and acute leukaemia datasets. The experimental results indicate that the proposed method per- forms well in selecting genes and achieves high classification accuracies with these genes.展开更多
基金supported in part by National Natural Science Foundation of China(Nos.62102311,62202377,62272385)in part by Natural Science Basic Research Program of Shaanxi(Nos.2022JQ-600,2022JM-353,2023-JC-QN-0327)+2 种基金in part by Shaanxi Distinguished Youth Project(No.2022JC-47)in part by Scientific Research Program Funded by Shaanxi Provincial Education Department(No.22JK0560)in part by Distinguished Youth Talents of Shaanxi Universities,and in part by Youth Innovation Team of Shaanxi Universities.
文摘With the widespread data collection and processing,privacy-preserving machine learning has become increasingly important in addressing privacy risks related to individuals.Support vector machine(SVM)is one of the most elementary learning models of machine learning.Privacy issues surrounding SVM classifier training have attracted increasing attention.In this paper,we investigate Differential Privacy-compliant Federated Machine Learning with Dimensionality Reduction,called FedDPDR-DPML,which greatly improves data utility while providing strong privacy guarantees.Considering in distributed learning scenarios,multiple participants usually hold unbalanced or small amounts of data.Therefore,FedDPDR-DPML enables multiple participants to collaboratively learn a global model based on weighted model averaging and knowledge aggregation and then the server distributes the global model to each participant to improve local data utility.Aiming at high-dimensional data,we adopt differential privacy in both the principal component analysis(PCA)-based dimensionality reduction phase and SVM classifiers training phase,which improves model accuracy while achieving strict differential privacy protection.Besides,we train Differential privacy(DP)-compliant SVM classifiers by adding noise to the objective function itself,thus leading to better data utility.Extensive experiments on three high-dimensional datasets demonstrate that FedDPDR-DPML can achieve high accuracy while ensuring strong privacy protection.
基金supported by the Second Tibetan Plateau Scientific Expedition and Research Program(Grant no.2019QZKK0904)Natural Science Foundation of Hebei Province(Grant no.D2022403032)S&T Program of Hebei(Grant no.E2021403001).
Abstract: The selection of important factors in machine learning-based susceptibility assessments is crucial to obtaining reliable susceptibility results. In this study, metaheuristic optimization and feature selection techniques were applied to identify the most important input parameters for mapping debris flow susceptibility in the southern mountain area of Chengde City, Hebei Province, China, using machine learning algorithms. In total, 133 historical debris flow records and 16 related factors were selected. The support vector machine (SVM) was first used as the base classifier, and then a hybrid model was introduced through a two-step process. First, the particle swarm optimization (PSO) algorithm was employed to select the SVM model hyperparameters. Second, two feature selection algorithms, namely principal component analysis (PCA) and PSO, were integrated into the PSO-based SVM model, generating the PCA-PSO-SVM and FS-PSO-SVM models, respectively. Three statistical metrics (accuracy, recall, and specificity) and the area under the receiver operating characteristic curve (AUC) were employed to evaluate and validate the performance of the models. The results indicated that the feature selection-based models exhibited the best performance, followed by the PSO-based SVM and SVM models. Moreover, the FS-PSO-SVM model performed better than the PCA-PSO-SVM model, showing the highest AUC, accuracy, recall, and specificity values in both the training and testing processes. The selection of optimal features was thus found to be crucial to improving the reliability of debris flow susceptibility assessment results. Moreover, the PSO algorithm proved to be not only an effective tool for hyperparameter optimization but also a useful feature selection algorithm for improving the prediction accuracy of debris flow susceptibility with machine learning algorithms. The high and very high debris flow susceptibility zones cover approximately 38.01% of the study area, where debris flow may occur under intensive human activities and heavy rainfall events.
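The PSO hyperparameter-selection step described in this abstract can be sketched as follows. This is an illustrative, self-contained implementation, not the authors' code: a simple quadratic surrogate stands in for the SVM cross-validation error that the study actually minimizes, and all settings (swarm size, inertia, bounds) are assumed values.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, iters=60,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer over a box-constrained search space."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # velocity pulls toward the particle's own best and the swarm best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Stand-in for SVM cross-validation error as a function of (log C, log gamma);
# in the real workflow this would train and score an SVM at each position.
def surrogate_cv_error(params):
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

best, err = pso_minimize(surrogate_cv_error, bounds=[(-3, 3), (-5, 1)])
```

In the actual pipeline, `surrogate_cv_error` would be replaced by a function that trains an SVM with the candidate hyperparameters and returns its cross-validated error.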
Funding: Hebei Province Key Research and Development Projects (Nos. 20313701D and 19210404D); the Beijing Key Laboratory of Mobile Computing and Universal Equipment Open Project; the National Social Science Fund of China (17AJL014); the Beijing University of Posts and Telecommunications Construction of World-Class Disciplines and Characteristic Development Guidance Special Fund "Cultural Inheritance and Innovation" Project (No. 505019221); the National Natural Science Foundation of China (Nos. U1536112, 81673697, and 61872046); the National Social Science Fund Key Project of China (No. 17AJL014); the "Blue Fire Project" (Huizhou) University of Technology Joint Innovation Project (CXZJHZ201729); Industry-University Cooperation Collaborative Education Projects of the Ministry of Education (Nos. 201902218004, 201902024006, 201901197007, 201901199005, and 201901197001); the Shijiazhuang Science and Technology Plan Project (236240267A); the Hebei Province Key Research and Development Plan Project (20312701D).
Abstract: The distribution of data has a significant impact on classification results. When the distribution of one class is insignificant compared to that of another, data imbalance occurs, which leads to rising outlier values and noise and can therefore greatly affect the speed and performance of classification. Given these problems, this paper starts from the motivation and mathematical representation of classification and puts forward a new classification method based on the relationship between different classification formulations. Combining the vector characteristics of the actual problem with the choice of matrix characteristics, we first analyze ordered regression and introduce slack variables to handle the constraint problem of lone points. We then introduce fuzzy factors, on the basis of the support vector machine, to address the gaps between isolated points, and introduce cost control to handle sample skew. Finally, based on the bi-boundary support vector machine, a two-step weight-setting twin classifier is constructed. This helps to identify multiple tasks with feature-selected patterns without the need for additional optimizers, addressing large-scale classification problems in which very low category distribution gaps cannot otherwise be dealt with effectively.
Funding: Financially supported by the Deanship of Scientific Research at King Khalid University under Research Grant Number R.G.P.2/549/44.
Abstract: Algorithms for steganography are methods of hiding data transfers in media files. Several machine learning architectures have recently been presented to improve stego image identification performance by using spatial information, and these methods have made it feasible to handle a wide range of problems associated with image analysis. Information embedding methods use images with little information or low payload, but the goal of all contemporary research is to employ high-payload images for classification. To address the need for both low- and high-payload images, this work provides a machine learning approach to steganography image classification that uses the Curvelet transform to efficiently extract characteristics from both types of images. The support vector machine (SVM), a commonplace classification technique, is employed to determine whether an image is a stego or a cover. The Wavelet Obtained Weights (WOW), Spatial Universal Wavelet Relative Distortion (S-UNIWARD), Highly Undetectable Steganography (HUGO), and Minimizing the Power of Optimal Detector (MiPOD) steganography techniques are used in a variety of experimental scenarios to evaluate the performance of the proposed method. Using WOW at several payloads, the proposed approach achieves a classification accuracy of 98.60%, demonstrating its superiority over state-of-the-art methods.
Funding: Supported by the Hunan Province Traditional Chinese Medicine Research Project (No. B2023043), the Hunan Provincial Department of Education Scientific Research Project (No. 22B0386), and the Hunan University of Traditional Chinese Medicine Campus-level Research Fund Project (No. 2022XJZKC004).
Abstract: AIM: To develop a classifier for traditional Chinese medicine (TCM) syndrome differentiation of diabetic retinopathy (DR) using optimized machine learning algorithms, providing a basis for objective and intelligent TCM syndrome differentiation. METHODS: Collated data on real-world DR cases were collected. A variety of machine learning methods were used to construct TCM syndrome classification models, and the best performer was selected as the basic model. A genetic algorithm (GA) was used for feature selection to obtain the optimal feature combination. Harris Hawks Optimization (HHO) was used for parameter optimization, and a classification model based on feature selection and parameter optimization was constructed. The performance of the model was compared with that of other optimization algorithms. The models were evaluated with accuracy, precision, recall, and F1 score as indicators. RESULTS: Data on 970 cases that met the screening requirements were collected. The support vector machine (SVM) was the best basic classification model, with an accuracy of 82.05%, a precision of 82.34%, a recall of 81.81%, and an F1 value of 81.76%. After GA screening, the optimal feature combination contained 37 feature values, which was consistent with TCM clinical practice. The model based on the optimal combination and SVM (GA_SVM) improved accuracy by 1.92% over the basic classifier. The SVM model based on HHO and GA optimization (HHO_GA_SVM) had the best performance and convergence speed compared with other optimization algorithms, improving accuracy by 3.51% over the basic classification model. CONCLUSION: HHO and GA optimization can improve the performance of SVM in TCM syndrome differentiation of DR, providing a new method and research direction for TCM intelligent assisted syndrome differentiation.
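The GA feature-selection step described in this abstract can be sketched as follows. This is a minimal illustrative implementation under stated assumptions: the fitness function here is hypothetical (it rewards three designated "informative" features and penalizes subset size), standing in for the cross-validated classifier accuracy the study actually uses.

```python
import random

def ga_feature_select(n_features, fitness, pop_size=40, gens=60,
                      crossover_p=0.8, mutation_p=0.02, seed=1):
    """Minimal genetic algorithm over binary feature masks (1 = keep feature)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    def tournament():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b
    for _ in range(gens):
        new_pop = [max(pop, key=fitness)]           # elitism: keep best mask
        while len(new_pop) < pop_size:
            p1, p2 = tournament()[:], tournament()[:]
            if rng.random() < crossover_p:          # one-point crossover
                cut = rng.randrange(1, n_features)
                p1 = p1[:cut] + p2[cut:]
            for d in range(n_features):             # bit-flip mutation
                if rng.random() < mutation_p:
                    p1[d] = 1 - p1[d]
            new_pop.append(p1)
        pop = new_pop
    return max(pop, key=fitness)

# Stand-in fitness: in the real pipeline this would be cross-validated model
# accuracy on the selected columns; here features 0, 3 and 5 are "informative"
# and every selected feature costs a small penalty.
INFORMATIVE = {0, 3, 5}
def fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)

best_mask = ga_feature_select(10, fitness)
```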
Funding: Supported by the Science and Technology Innovation Ability Cultivation Project of Hebei Provincial Planning for College and Middle School Students (22E50590D) and the Priority Research Project of Langfang Education Sciences Planning (JCJY202130).
Abstract: The turbidite channels of the South China Sea have attracted great attention. Influenced by complex faulting and rapid lateral changes of lithofacies, predicting these channels through conventional seismic attributes is not sufficiently accurate. In response, this study combined grey relational analysis (GRA) with the support vector machine (SVM) and established a set of prediction procedures suitable for reservoirs with complex geological conditions. In a case study of the Huangliu Formation in the Qiongdongnan Basin, South China Sea, this study first nondimensionalized the conventional seismic attributes of Gas Layer Group I and then used the GRA method to obtain the main relational factors; a higher relational degree indicates a higher probability that an attribute responds to the turbidite channel. The optimized attributes with the highest relational factors were then accumulated to obtain a first-order accumulated sequence, which was used as the input training sample of the SVM model, successfully constructing the SVM turbidite channel model. Drilling results prove that the GRA-SVM method has a high drilling coincidence rate. Utilizing core and logging data and making full use of the advantages of seismic inversion in predicting channel sand boundaries, this study divides the sedimentary microfacies of the Huangliu Formation in the Lingshui 17-2 Gas Field. This comprehensive study shows that the GRA-SVM method has high accuracy for predicting turbidite channels and can be used as a superior prediction method under complex geological conditions.
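The grey relational analysis step can be sketched with the standard GRA formulas (grey relational coefficient with resolution coefficient rho, customarily 0.5; the toy data below is illustrative, not from the study):

```python
def grey_relational_degrees(reference, comparisons, rho=0.5):
    """Grey relational degree of each comparison sequence to the reference
    sequence (all sequences assumed already nondimensionalized)."""
    # absolute difference sequences
    diffs = [[abs(r - c) for r, c in zip(reference, comp)] for comp in comparisons]
    d_min = min(min(row) for row in diffs)
    d_max = max(max(row) for row in diffs)
    degrees = []
    for row in diffs:
        # grey relational coefficient at each point; rho damps the d_max term
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        degrees.append(sum(coeffs) / len(coeffs))  # mean coefficient = degree
    return degrees

# Toy example: attribute A tracks the reference closely, attribute B does not,
# so A should receive the higher relational degree.
ref = [0.1, 0.4, 0.6, 0.9]
attr_a = [0.12, 0.38, 0.63, 0.88]
attr_b = [0.9, 0.2, 0.8, 0.1]
deg_a, deg_b = grey_relational_degrees(ref, [attr_a, attr_b])
```

In the study's workflow, the reference would be the known channel response and the comparisons would be the candidate seismic attributes, with the highest-degree attributes passed on to the SVM.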
文摘This article delves into the analysis of performance and utilization of Support Vector Machines (SVMs) for the critical task of forest fire detection using image datasets. With the increasing threat of forest fires to ecosystems and human settlements, the need for rapid and accurate detection systems is of utmost importance. SVMs, renowned for their strong classification capabilities, exhibit proficiency in recognizing patterns associated with fire within images. By training on labeled data, SVMs acquire the ability to identify distinctive attributes associated with fire, such as flames, smoke, or alterations in the visual characteristics of the forest area. The document thoroughly examines the use of SVMs, covering crucial elements like data preprocessing, feature extraction, and model training. It rigorously evaluates parameters such as accuracy, efficiency, and practical applicability. The knowledge gained from this study aids in the development of efficient forest fire detection systems, enabling prompt responses and improving disaster management. Moreover, the correlation between SVM accuracy and the difficulties presented by high-dimensional datasets is carefully investigated, demonstrated through a revealing case study. The relationship between accuracy scores and the different resolutions used for resizing the training datasets has also been discussed in this article. These comprehensive studies result in a definitive overview of the difficulties faced and the potential sectors requiring further improvement and focus.
Abstract: The one-class classification problem has become popular in many fields, with a wide range of applications in anomaly detection, fault diagnosis, and face recognition. We investigate the one-class classification problem for second-order tensor data. Traditional vector-based one-class classification methods such as the one-class support vector machine (OCSVM) and the least squares one-class support vector machine (LSOCSVM) have limitations when a tensor is used as input data, so we propose a new tensor one-class classification method, LSOCSTM, which uses tensors as input directly. On the one hand, tensor input not only enables the classification of tensor data but also benefits vector data: reshaping a vector into a higher-dimensional tensor before classification still improves accuracy and overcomes the over-fitting problem. On the other hand, unlike the one-class support tensor machine (OCSTM), we use a squared loss instead of the original loss function, so that we solve a series of linear equations instead of quadratic programming problems. The distance to the hyperplane is then used as the classification metric, and the proposed method is more accurate and faster than existing methods. Experimental results show the high efficiency of the proposed method compared with several state-of-the-art methods.
Funding: Supported by the National Creative Research Groups Science Foundation of P.R. China (NCRGSFC: 60421002) and the National High Technology Research and Development Program of China (863 Program) (2006AA04Z182).
Funding: This work has been supported by the National Outstanding Youth Science Foundation of China (No. 60025308) and the Teaching and Research Award Program for Outstanding Young Teachers in Higher Education Institutions of MOE, China.
Abstract: A predictive control algorithm based on a least squares support vector machine (LS-SVM) model is presented for a class of complex systems with strong nonlinearity. The nonlinear off-line model of the controlled plant is built by an LS-SVM with a radial basis function (RBF) kernel. While the system runs, the off-line model is linearized at each sampling instant, and the generalized predictive control (GPC) algorithm is employed to implement predictive control of the plant. The algorithm is applied to a boiler temperature control system with complicated nonlinearity and a large time delay. Experimental results verify the effectiveness and merit of the algorithm.
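The off-line LS-SVM modeling step has a convenient property worth illustrating: training reduces to solving one linear system, [[0, 1ᵀ], [1, K + I/γ]] [b; α] = [0; y], instead of a quadratic program. The sketch below is illustrative only; the sine-map plant, kernel width, and regularization value are assumptions, not the paper's boiler model.

```python
import numpy as np

def rbf(a, b, sigma=1.0):
    return np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    """LS-SVM regression: one linear solve gives bias b and coefficients alpha."""
    n = len(X)
    K = np.array([[rbf(X[i], X[j], sigma) for j in range(n)] for i in range(n)])
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                     # top row: sum of alphas = 0
    A[1:, 0] = 1.0                     # first column: bias term
    A[1:, 1:] = K + np.eye(n) / gamma  # kernel matrix plus ridge term
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]             # alpha, b

def lssvm_predict(x, X, alpha, b, sigma=1.0):
    return sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, X)) + b

# Toy nonlinear plant map y = sin(x), sampled off-line as in the paper's
# off-line modeling step (data and kernel width are illustrative choices).
X = np.linspace(-3, 3, 30).reshape(-1, 1)
y = np.sin(X).ravel()
alpha, b = lssvm_fit(X, y)
pred = lssvm_predict(np.array([1.0]), X, alpha, b)
```

At run time, the fitted model would be linearized around the current operating point at each sampling instant to supply the linear model GPC needs.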
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 90207012).
Abstract: In this paper, we propose a multidimensional version of recurrent least squares support vector machines (MDRLS-SVM) to solve the problem of predicting chaotic systems. To acquire better prediction performance, the high-dimensional space, which provides more information on the system than the scalar time series, is first reconstructed using Takens's embedding theorem. The MDRLS-SVM is then used in the high-dimensional space instead of the traditional RLS-SVM, and the prediction performance is improved from the point of view of the reconstructed embedding phase space. In addition, the MDRLS-SVM algorithm is analysed in the presence of noise, and we find that it has lower sensitivity to noise than the RLS-SVM.
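The phase-space reconstruction step via Takens's embedding theorem is simple to sketch: each reconstructed point stacks delayed copies of the scalar series. The series, dimension, and delay below are illustrative; in practice the delay and dimension would be chosen by criteria such as mutual information and false nearest neighbours.

```python
def delay_embed(series, dim, tau):
    """Reconstruct a phase space from a scalar time series via Takens's
    delay embedding: point t is (x[t], x[t+tau], ..., x[t+(dim-1)*tau])."""
    n = len(series) - (dim - 1) * tau
    return [[series[t + d * tau] for d in range(dim)] for t in range(n)]

# Toy scalar series; the embedded points would feed the MDRLS-SVM predictor.
series = list(range(10))
points = delay_embed(series, dim=3, tau=2)
```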
基金supported by the National Creative Research Groups Science Foundation of China (NCRGSFC:60721062)National Basic Research Program of China (973 Program) (No.2007CB714000)
Abstract: A sparse approximation algorithm based on projection is presented in this paper to overcome the non-sparsity limitation of least squares support vector machines (LS-SVM). New inputs are projected into the subspace spanned by the previous basis vectors (BV), and those inputs whose squared distance from the subspace exceeds a threshold are added to the BV set, while the others are rejected, which yields the sparse approximation. In addition, a recursive approach to deleting an existing vector in the BV set is proposed. The online LS-SVM, sparse approximation, and BV removal are then combined to produce a sparse online LS-SVM algorithm that can control the memory size irrespective of the amount of data processed. The suggested algorithm is applied to the online modeling of a pH neutralizing process and the isomerization plant of a refinery, respectively. A detailed comparison of computing time and precision between the suggested algorithm and the non-sparse one is also given. The results show that the proposed algorithm greatly improves sparsity at only a small cost in precision.
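The projection criterion can be sketched in kernel form: the squared feature-space distance of a new input from the BV subspace is d² = k(x,x) − k_xᵀ K⁻¹ k_x, and the input is admitted only when d² exceeds the threshold. This is an illustrative sketch (kernel, threshold, and data are assumptions; a production version would update K⁻¹ recursively and guard against ill-conditioning rather than re-solving).

```python
import numpy as np

def rbf(a, b, sigma=1.0):
    return np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))

def subspace_sq_distance(x, bv, sigma=1.0):
    """Squared feature-space distance of phi(x) from span{phi(v) : v in bv}."""
    K = np.array([[rbf(u, v, sigma) for v in bv] for u in bv])
    k_x = np.array([rbf(x, v, sigma) for v in bv])
    # projection coefficients a solve K a = k_x; then d^2 = k(x,x) - k_x . a
    a = np.linalg.solve(K, k_x)
    return rbf(x, x, sigma) - k_x @ a

def grow_bv_set(stream, threshold=1e-3, sigma=1.0):
    """Admit an input to the basis-vector set only when it lies far enough
    from the subspace already spanned; otherwise reject it."""
    bv = [stream[0]]
    for x in stream[1:]:
        if subspace_sq_distance(x, bv, sigma) > threshold:
            bv.append(x)
    return bv

# A near-duplicate input should be rejected, a distant one admitted.
stream = [np.array([0.0]), np.array([1e-4]), np.array([2.0])]
bv = grow_bv_set(stream)
```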
Funding: Project supported by the National Outstanding Youth Science Foundation of China (No. 60025308) and the Teaching and Research Award Program for Outstanding Young Teachers in Higher Education Institutions of MOE, China.
Abstract: Intended for industrial processes with different degrees of nonlinearity, the two predictive control algorithms presented in this paper are based on least squares support vector machine (LS-SVM) models. For a weakly nonlinear system, the system model is built using an LS-SVM with a linear kernel function, and the obtained linear LS-SVM model is transformed into a linear input-output relation for the controlled system. For a strongly nonlinear system, however, the off-line model of the controlled system is built using an LS-SVM with a radial basis function (RBF) kernel; the obtained nonlinear LS-SVM model is linearized at each sampling instant while the system runs, from which an on-line linear input-output model of the system is built. Based on the obtained linear input-output model, the generalized predictive control (GPC) algorithm is employed to implement predictive control of the controlled plant in both algorithms. Simulation results obtained by applying the presented algorithms to two different industrial process models reveal the effectiveness and merit of both algorithms.
Abstract: This study explores the least squares support vector and wavelet technique (WLSSVM), a new hybrid technique, for monthly stream flow forecasting. The 30-day periodic forecasting statistics used in this study are derived by applying the model to river flow data of the Jhelum and Chenab rivers. The root mean square error (RMSE), mean absolute error (MAE), and correlation (R) statistics are used to evaluate the accuracy of the WLSSVM and WR models, and the accuracy of the WLSSVM model is compared with the LSSVM, WR, and LR models. The two rivers surveyed are in the Republic of Pakistan and drain areas of 39,200 km² for the Jhelum River and 67,515 km² for the Chenab River. Using discrete wavelets, the observed data has been decomposed into sub-series, which have then been used as inputs to the least squares support vector machines for forecasting the hydrological variables. The comparison indicates that the WLSSVM is more accurate than the LSSVM, WR, and LR models in river flow forecasting.
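The wavelet decomposition step can be sketched with the simplest orthonormal wavelet, the Haar wavelet (the study does not specify its wavelet family, so Haar here is an assumption, and the flow series is a toy example):

```python
def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform: split a series into
    approximation (low-pass) and detail (high-pass) sub-series."""
    approx = [(signal[i] + signal[i + 1]) / 2 ** 0.5
              for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 ** 0.5
              for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def multilevel(signal, levels):
    """Decompose into [detail_1, ..., detail_L, approx_L] sub-series, which
    would then feed the LS-SVM as separate inputs."""
    subs = []
    cur = signal
    for _ in range(levels):
        cur, d = haar_dwt(cur)
        subs.append(d)
    subs.append(cur)
    return subs

# Toy monthly flow series; orthonormal Haar preserves the total energy.
flow = [5.0, 7.0, 6.0, 8.0, 9.0, 11.0, 10.0, 12.0]
subs = multilevel(flow, 2)
```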
Abstract: A multi-layer adaptive parameter-optimizing algorithm is developed to improve least squares support vector machines (LS-SVM), and a military aircraft life-cycle-cost (LCC) intelligent estimation model is proposed based on the improved LS-SVM. The intelligent cost estimation process in the model is divided into three steps. In the first step, a cost-drive factor is selected, which is significant for cost estimation. In the second step, military aircraft training samples of costs and the cost-drive-factor set are fitted by the LS-SVM. In the third step, the model is applied to cost estimation of new aircraft types. Chinese military aircraft costs are estimated in the paper, and the results show that the costs estimated by the new model are closer to the true costs than those of traditionally used methods.
Abstract: The least squares support vector machine (LS-SVM) is an improvement on the SVM. Combining the LS-SVM with multi-resolution analysis (MRA), this letter proposes the multi-resolution LS-SVM (MLS-SVM). The proposed algorithm has the same theoretical framework as MRA but better approximation ability. At a fixed scale, the MLS-SVM is a classical LS-SVM, but the MLS-SVM can gradually approximate the target function at different scales. In experiments, the MLS-SVM is used for nonlinear system identification and achieves better identification accuracy.
Funding: Supported by the National Natural Science Foundation of China (No. 60102005).
Abstract: A new strategy for noise reduction on fast fading channels is presented. First, more information is acquired using the reconstructed embedding phase space. Then, noise reduction of the fast fading channel is realized based on recurrent least squares support vector machines (RLS-SVM). This filtering technique makes no use of the spectral contents of the signal. Based on the stability and fractality of the chaotic attractor, the RLS-SVM algorithm is a good candidate for nonlinear time-series noise reduction. Simulation results show that better noise-reduction performance is acquired when the signal-to-noise ratio is 12 dB.
基金supported by the National Natural Science Foundation Project of China(Nos.72088101 and 42177164)the Distinguished Youth Science Foundation of Hunan Province of China(No.2022JJ10073)The first author was funded by China Scholarship Council(No.202106370038).
Abstract: Hard rock pillars are among the important structures in engineering design and excavation in underground mines, and accurate and convenient prediction of pillar stability is of great significance for underground space safety. This paper aims to develop hybrid support vector machine (SVM) models improved by three metaheuristic algorithms, the grey wolf optimizer (GWO), the whale optimization algorithm (WOA), and the sparrow search algorithm (SSA), for predicting hard rock pillar stability. An integrated dataset containing 306 hard rock pillars was established to generate the hybrid SVM models. Five parameters, namely pillar height, pillar width, ratio of pillar width to height, uniaxial compressive strength, and pillar stress, were set as input parameters. Two global indices, three local indices, and the receiver operating characteristic (ROC) curve with the area under the ROC curve (AUC) were utilized to evaluate the performance of all hybrid models. The results confirmed that the SSA-SVM model is the best prediction model, with the highest values of all global and local indices. Nevertheless, the performance of the SSA-SVM model for predicting unstable pillars (AUC: 0.899) is not as good as that for stable (AUC: 0.975) and failed pillars (AUC: 0.990). To verify the effectiveness of the proposed models, 5 field cases were investigated in a metal mine and another 5 cases were collected from several published works. The validation results indicated that the SSA-SVM model obtained considerable accuracy, which means that combining SVM with metaheuristic algorithms is a feasible approach to predicting pillar stability.
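The AUC metric used to compare these hybrid models can be computed directly from its rank-sum (Mann-Whitney) interpretation: the probability that a randomly chosen positive case is scored above a randomly chosen negative case. The labels and scores below are toy values, not the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs the scorer ranks
    correctly, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy binary labels and classifier scores: one positive is ranked below a
# negative, so the AUC is 8 correct pairs out of 9.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

For multi-class pillar states (stable/unstable/failed), the per-class AUCs reported above correspond to one-vs-rest applications of this computation.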
Abstract: Lung cancer, indicated by the presence of pulmonary nodules in the lung, is among the most dangerous and deadly diseases, and is mostly caused by uncontrolled growth of cells in the lung. Lung nodule detection plays a significant role in detecting and screening lung cancer in computed tomography (CT) scan images, and early detection is important for the survival rate and treatment of lung cancer patients. Moreover, pulmonary nodule classification techniques based on convolutional neural networks can be used for accurate and efficient detection of lung cancer. This work proposes an automatic nodule detection method for CT images based on a modified AlexNet architecture and the support vector machine (SVM) algorithm, namely LungNet-SVM. The proposed model consists of seven convolutional layers, three pooling layers, and two fully connected layers used to extract features. An SVM classifier is applied for the binary classification of nodules into benign and malignant. The experimental analysis is performed on the publicly available benchmark dataset Lung Nodule Analysis 2016 (LUNA16). The proposed model achieves 97.64% accuracy, 96.37% sensitivity, and 99.08% specificity. A comparative analysis has been carried out between the proposed LungNet-SVM model and existing state-of-the-art approaches for the classification of lung cancer. The experimental results indicate that the proposed LungNet-SVM model achieves remarkable accuracy on the LUNA16 dataset.
Funding: Project supported by the National Basic Research Program (973) of China (No. 2002CB312200) and the Center for Bioinformatics Program Grant of the Harvard Center of Neurodegeneration and Repair, Harvard Medical School, Harvard University, Boston, USA.
Abstract: In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables, the small number of samples, and the non-linearity of the problem, and it is difficult to obtain satisfying results using conventional linear statistical methods. Recursive feature elimination based on the support vector machine (SVM-RFE) is an effective algorithm in which gene selection and cancer classification are integrated into a consistent framework. In this paper, we propose a new method for selecting the parameters of this algorithm implemented with Gaussian-kernel SVMs: rather than the common practice of picking the apparently best parameters, a genetic algorithm is used to search for an optimal parameter pair. Fast implementation issues for this method are also discussed for pragmatic reasons. The proposed method was tested on two representative datasets, hereditary breast cancer and acute leukaemia. The experimental results indicate that the proposed method performs well in selecting genes and achieves high classification accuracies with these genes.
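The recursive feature elimination loop at the heart of SVM-RFE can be sketched as follows. This is an illustrative implementation under stated assumptions: a ridge-regularized least-squares scorer stands in for the linear-SVM weight vector refit at each step, the ranking criterion is the standard squared-weight rule, and the data is synthetic.

```python
import numpy as np

def rfe_rank(X, y, ranking_weights):
    """Recursive feature elimination: repeatedly fit, drop the feature with
    the smallest squared weight, and record the elimination order."""
    active = list(range(X.shape[1]))
    order = []  # features in elimination order (least important first)
    while len(active) > 1:
        w = ranking_weights(X[:, active], y)
        worst = min(range(len(active)), key=lambda j: w[j] ** 2)
        order.append(active.pop(worst))
    order.append(active[0])
    return order[::-1]  # most important first

# Stand-in scorer: ridge-regularized least-squares weights. SVM-RFE proper
# would refit a linear (or Gaussian-kernel) SVM at each elimination step.
def ls_weights(X, y, lam=1e-3):
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Synthetic data: only features 1 and 4 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.normal(size=200)
ranking = rfe_rank(X, y, ls_weights)
```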