The distribution of data has a significant impact on classification results. When one class is sparsely represented compared with another, data imbalance occurs. This increases the number of outliers and the amount of noise, so both the speed and the performance of classification can be greatly affected. To address these problems, this paper starts from the motivation and mathematical representation of classification and puts forward a new classification method based on the relationship between different classification formulations. Combining the vector characteristics of the actual problem with the choice of matrix characteristics, we first analyze the orderly regression and introduce slack variables to solve the constraint problem of isolated points. Then, on the basis of the support vector machine, we introduce fuzzy factors to solve the problem of the gap between isolated points. We introduce cost control to solve the problem of sample skew. Finally, based on the bi-boundary support vector machine, a two-step weight-setting twin classifier is constructed. This helps to identify multitasks with feature-selected patterns without the need for additional optimizers, which addresses the inability of large-scale classification to deal effectively with a very low category distribution gap.
To handle the semi-supervised problem quickly and efficiently in the twin support vector machine (TWSVM) field, a semi-supervised twin support vector machine (S2TSVM) is proposed by adding the original unlabeled samples. In S2TSVM, the addition of unlabeled samples can easily cause the classification hyperplane to deviate from the sample points. A center-distance principle is therefore proposed to pre-classify the unlabeled samples, giving a pre-classified S2TSVM (PS2TSVM). Compared with S2TSVM, PS2TSVM not only alleviates the deviation of the classification hyperplane from the samples but also improves the training speed. PS2TSVM is then smoothed to obtain the pre-classified smooth S2TSVM (PS3TSVM), and its convergence is deduced. Finally, nine datasets from the UCI machine learning database are selected for comparison with other types of semi-supervised models. The experimental results show that the proposed PS3TSVM model has better classification results.
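The abstract does not spell out the center-distance principle. One plausible reading, assumed here purely for illustration, is that each unlabeled sample is provisionally assigned to the class whose labeled centroid is nearest. The function names and the tie-breaking rule are hypothetical and are not taken from the paper.

```python
import math


def centroid(points):
    """Return the mean of a list of equal-length feature vectors."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dim)]


def pre_classify(positives, negatives, unlabeled):
    """Give each unlabeled sample the label of the nearest class centroid.

    Assumption: ties are assigned to the positive class.
    """
    c_pos = centroid(positives)
    c_neg = centroid(negatives)
    labels = []
    for x in unlabeled:
        d_pos = math.dist(x, c_pos)  # Euclidean distance to the positive centroid
        d_neg = math.dist(x, c_neg)  # Euclidean distance to the negative centroid
        labels.append(1 if d_pos <= d_neg else -1)
    return labels
```

The provisional labels could then be passed to a semi-supervised trainer together with the labeled samples.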
With the progress of deep learning research, convolutional neural networks (CNNs) have become the most important method for feature extraction. How effectively the extracted features are classified and recognized directly affects the performance of the entire network. Traditional processing methods include classification models such as fully connected networks and support vector machines. To address the tendency of the traditional CNN to over-fit when classifying small samples, a CNN-TWSVM hybrid model is proposed that uses the twin support vector machine (TWSVM), which has higher computational efficiency, as the CNN classifier, and the model is applied to the traffic sign recognition task. To improve the generalization ability of the model, a wavelet kernel function is introduced to deal with the nonlinear classification task. The method fine-tunes a network initialized on the ImageNet dataset for the specific domain and takes the output of an inner layer of the network to extract highly abstract features of the traffic sign image. Finally, the TWSVM based on the wavelet kernel function is used to identify the traffic signs, which effectively reduces over-fitting in traffic sign classification. On the GTSRB and BELGIUMTS datasets, the validity and generalization ability of the improved model are verified by comparison with different kernel functions and different SVM classifiers.
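The abstract does not say which wavelet kernel is used. A common choice in the wavelet SVM literature is the translation-invariant Morlet kernel, K(x, y) = ∏ cos(1.75 u_i) exp(-u_i² / 2) with u_i = (x_i - y_i) / a. The sketch below assumes that form; the function name and the dilation parameter `a` are illustrative.

```python
import math


def morlet_wavelet_kernel(x, y, a=1.0):
    """Translation-invariant Morlet wavelet kernel between two vectors.

    Assumption: the mother wavelet is h(u) = cos(1.75 u) * exp(-u^2 / 2),
    and `a` is a positive dilation parameter shared by all dimensions.
    """
    k = 1.0
    for xi, yi in zip(x, y):
        u = (xi - yi) / a
        k *= math.cos(1.75 * u) * math.exp(-0.5 * u * u)
    return k
```

In a hybrid pipeline of the kind described, `x` and `y` would be feature vectors taken from an inner CNN layer.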
Tensor representation is useful for reducing the overfitting problem of vector-based learning algorithms in pattern recognition. This is mainly because the structure information of objects in pattern analysis is a reasonable constraint that reduces the number of unknown parameters used to model a classifier. In this paper, we generalize the vector-based learning algorithm TWin Support Vector Machine (TWSVM) to the tensor-based method TWin Support Tensor Machines (TWSTM), which accepts general tensors as input. To examine the effectiveness of TWSTM, we implement it for Microcalcification Clusters (MCs) detection. In the tensor subspace domain, the MCs detection procedure is formulated as a supervised learning and classification problem, and TWSTM is used as a classifier to decide whether MCs are present. A large number of experiments were carried out to evaluate and compare the performance of the proposed MCs detection algorithm. Compared with TWSVM, the tensor version reduces the overfitting problem.
This project focuses on developing a system that helps animal researchers and wildlife photographers overcome many of the challenges they face in the field. They often have to wait patiently for long hours, sometimes several days, in whatever location and under severe weather conditions until they capture what they are interested in. There is also a big demand for rare wildlife photographs. The proposed method automates the task using a microcontroller-controlled camera, image processing and machine learning techniques. First, with the aid of a microcontroller and four passive infrared (PIR) sensors, the system automatically detects the presence of an animal and rotates the camera toward that direction. Then a motion detection algorithm brings the animal into the middle of the frame, and a high-end auto-focus webcam captures it. The captured images are sent to a PC and compared with a photograph database to check whether the animal is the one the photographer has chosen. If it is, the system automatically captures more images. Although several technologies are available, none of them can recognize what it captures, and none detects the presence of an animal at different angles. Most available equipment uses a set of PIR sensors, and whatever disturbs the IR field is automatically captured and stored. Night-time images are black and white and lack detail and clarity because of the quality of the infrared flash. If the infrared flash is designed for the best image quality, range is sacrificed. The photographer might be interested in a specific animal, but existing equipment has no facility to recognize automatically whether the captured animal is the photographer's choice.
Accurately estimating the State of Health (SOH) and Remaining Useful Life (RUL) of lithium-ion batteries (LIBs) is crucial for the continuous and stable operation of battery management systems. However, due to the complex internal chemical systems of LIBs and the nonlinear degradation of their performance, direct measurement of SOH and RUL is challenging. To address these issues, the Twin Support Vector Machine (TWSVM) method is proposed to predict SOH and RUL. Initially, the constant-current charging time of the lithium battery is extracted as a health indicator (HI) and decomposed using Variational Modal Decomposition (VMD), and feature correlations are computed using Importance of Random Forest Features (RF) to maximize the extraction of critical factors influencing battery performance degradation. Furthermore, to enhance the global search capability of the Convolution Optimization Algorithm (COA), improvements are made using Good Point Set theory and the Differential Evolution method. The Improved Convolution Optimization Algorithm (ICOA) is employed to optimize TWSVM parameters for constructing SOH and RUL prediction models. Finally, the proposed models are validated using the NASA and CALCE lithium-ion battery datasets. Experimental results demonstrate that the proposed models achieve an RMSE not exceeding 0.007 and an MAPE not exceeding 0.0082 for SOH and RUL prediction, with a relative error in RUL prediction within the range of [-1.8%, 2%]. Compared to other models, the proposed model not only exhibits superior fitting capability but also demonstrates robust performance.
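For reference, the two error measures quoted above can be computed as in the following minimal sketch. It uses the standard definitions of RMSE and MAPE; whether the paper reports MAPE as a fraction or as a percentage is not stated, and the fraction form is assumed here.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction.

    Assumption: every true value is nonzero, since each error is
    divided by the corresponding true value.
    """
    n = len(y_true)
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
```

Here `y_true` would hold measured SOH (or RUL) values and `y_pred` the model's predictions for the same cycles.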
Geospatial object detection within complex environments is a challenging problem in remote sensing. In this paper, we derive a multiple-kernel extension of the Relevance Vector Machine (RVM) technique. The proposed method learns an optimal kernel combination and the associated classifier simultaneously. Two feature types are extracted from images, forming basis kernels. These basis kernels are then combined with weights, and the resulting composite kernel exploits interest points and appearance information of objects simultaneously. The weights and the detection model are finally learnt by a new algorithm. Experimental results show that the proposed method improves detection accuracy to above 88%, yields a good interpretation of the selected subset of features, and appears sparser than traditional single-kernel RVMs.
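The weighted combination of basis kernels can be illustrated with a short sketch. It only forms the composite Gram matrix for fixed weights; in the paper the weights are learnt jointly with the classifier by an algorithm the abstract does not describe. The function name is hypothetical.

```python
def composite_kernel(basis_kernels, weights):
    """Return the weighted sum of equally sized square Gram matrices.

    basis_kernels: list of n-by-n matrices (lists of lists), one per feature type.
    weights: one non-negative weight per basis kernel (fixed here by assumption).
    """
    if len(basis_kernels) != len(weights):
        raise ValueError("need exactly one weight per basis kernel")
    n = len(basis_kernels[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, basis_kernels)) for j in range(n)]
        for i in range(n)
    ]
```

With non-negative weights, a sum of positive semidefinite Gram matrices stays positive semidefinite, which is why this form is a valid kernel.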
We improve the twin support vector machine (TWSVM) into a novel nonparallel hyperplane classifier for binary classification, termed ITSVM (improved twin support vector machine). By introducing different Lagrangian functions for the primal problems in the TWSVM, we obtain an improved dual formulation of TWSVM; the resulting ITSVM algorithm overcomes the common drawbacks of TWSVMs and inherits the essence of standard SVMs. Firstly, ITSVM does not need to compute large inverse matrices before training, which is inevitable for TWSVMs. Secondly, unlike TWSVMs, the kernel trick can be applied directly to ITSVM for the nonlinear case, so nonlinear ITSVM is theoretically superior to nonlinear TWSVM. Thirdly, ITSVM can be solved efficiently by the successive overrelaxation (SOR) technique or the sequential minimal optimization (SMO) method, which makes it more suitable for large-scale problems. We also prove that the standard SVM is a special case of ITSVM. Experimental results show the efficiency of our method in both computation time and classification accuracy.
The intrusion detection system (IDS) is becoming a critical component of network security. However, the performance of many proposed intelligent intrusion detection models is still not good enough for them to be applied to real network security. This paper aims to explore a novel and effective approach to significantly improve the performance of IDS. An intrusion detection model with twin support vector machines (TWSVMs) is proposed. In this model, an efficient algorithm is also proposed to determine the parameters of the TWSVMs. The performance of the proposed intrusion detection model is evaluated on the KDD'99 dataset and compared with that of some recent intrusion detection models. The results demonstrate that the proposed model achieves a remarkable improvement in intrusion detection rate and more balanced performance across attack types. Moreover, TWSVMs consume much less training time than standard support vector machines (SVMs).
In this paper, a new quadratic kernel-free least squares twin support vector machine (QLSTSVM) is proposed for binary classification problems. The advantage of QLSTSVM is that there is no need to select a kernel function and related parameters for nonlinear classification problems. After applying a consensus technique, we adopt the alternating direction method of multipliers to solve the reformulated consensus QLSTSVM directly. To reduce CPU time, the Karush-Kuhn-Tucker (KKT) conditions are also used to solve the QLSTSVM. The performance of QLSTSVM is tested on two artificial datasets and several University of California Irvine (UCI) benchmark datasets. Numerical results indicate that QLSTSVM may outperform several existing methods for solving the twin support vector machine with a Gaussian kernel in terms of classification accuracy and operation time.
The least squares projection twin support vector machine (LSPTSVM) computes faster than the classical least squares support vector machine (LSSVM). However, LSPTSVM is sensitive to outliers and its solution lacks sparsity, so it is difficult for LSPTSVM to process large-scale datasets with outliers. In this paper, we propose a robust LSPTSVM model (called R-LSPTSVM) by applying a truncated least squares loss function. The robustness of R-LSPTSVM is proved from a weighted perspective. Furthermore, we obtain a sparse solution of R-LSPTSVM by using the pivoting Cholesky factorization method in the primal space. Finally, the sparse R-LSPTSVM algorithm (SR-LSPTSVM) is proposed. Experimental results show that SR-LSPTSVM is insensitive to outliers and can deal with large-scale datasets quickly.
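The abstract does not give the exact truncated loss. A common form, assumed here for illustration, caps the squared residual at a fixed value so that an outlier contributes a bounded penalty. The names `truncated_squared_loss` and `cap` are hypothetical.

```python
def truncated_squared_loss(residual, cap):
    """Squared loss that is capped at `cap` (assumed form: min(r^2, cap))."""
    return min(residual * residual, cap)


def total_loss(residuals, cap=None):
    """Sum the plain squared loss, or the truncated loss when `cap` is given."""
    if cap is None:
        return sum(r * r for r in residuals)
    return sum(truncated_squared_loss(r, cap) for r in residuals)
```

With residuals `[0.5, -0.5, 10.0]`, the plain squared loss is dominated by the last term, while a cap of `4.0` bounds that sample's contribution at `4.0`.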
For classification problems, the traditional least squares twin support vector machine (LSTSVM) generates two nonparallel hyperplanes directly by solving two systems of linear equations instead of a pair of quadratic programming problems (QPPs), which makes LSTSVM much faster than the original TSVM. However, the standard LSTSVM adopts a quadratic loss measured by the minimal distance, which is sensitive to noise and unstable under re-sampling. To overcome this problem, the expectile distance is used to measure the margin between classes, and LSTSVM with asymmetric squared loss (aLSTSVM) is proposed. Compared with the original LSTSVM with the quadratic loss, the proposed aLSTSVM not only has comparable computational accuracy but also exhibits good properties such as noise insensitivity, scatter minimization and re-sampling stability. Numerical experiments on synthetic datasets, normally distributed clustered (NDC) datasets and University of California, Irvine (UCI) datasets with different noises confirm the performance and validity of the proposed algorithm.
In general, data contain noise that comes from faulty instruments, flawed measurements or faulty communication. Learning from data in the context of classification or regression is inevitably affected by this noise. To remove or greatly reduce its impact, we introduce the ideas of fuzzy membership functions and the Laplacian twin support vector machine (Lap-TSVM). A formulation of the linear intuitionistic fuzzy Laplacian twin support vector machine (IFLap-TSVM) is presented. Moreover, we extend the linear IFLap-TSVM to the nonlinear case by means of a kernel function. The proposed IFLap-TSVM counters the negative impact of noise and outliers by using fuzzy membership functions, and it is a more accurate and reasonable classifier because it uses the geometric distribution information of labeled and unlabeled data based on manifold regularization. Experiments with constructed artificial datasets, several UCI benchmark datasets and the MNIST dataset show that IFLap-TSVM has better classification accuracy than other state-of-the-art methods: the twin support vector machine (TSVM), the intuitionistic fuzzy twin support vector machine (IFTSVM) and Lap-TSVM.
The previously presented robust minimum class variance twin support vector machine (RMCV-TWSVM) achieves better classification performance than the classical TWSVM. RMCV-TWSVM introduces the class variance matrices of the positive and negative samples into the construction of the two hyperplanes. However, it does not consider the total structure information of all the samples, which can substantially reduce its classification accuracy. In this paper, a new algorithm named structural regularized TWSVM based on within-class scatter and between-class scatter (WSBS-STWSVM) is put forward. WSBS-STWSVM can make full use of the total within-class distribution information and between-class structure information of all the samples. The experimental results illustrate the high classification accuracy and strong generalization ability of the proposed algorithm.
Classifying intrusion attacks and normal network flow is a critical and challenging issue in network security research. Many intelligent intrusion detection models have been proposed, but their performance and efficiency are not satisfactory for real computer networks. This paper presents a novel and effective intrusion detection system based on a statistic reference model and twin support vector machines (TWSVMs). Moreover, a network flow feature selection procedure has been studied and implemented with TWSVMs. The performance of the proposed system is evaluated using the data set of the fifth international conference on knowledge discovery and data mining in 1999 (KDD'99), collected at MIT's Lincoln Labs, and the results indicate that the proposed system is more efficient and effective than conventional support vector machines (SVMs) and TWSVMs.
Funding: Hebei Province Key Research and Development Project (No. 20313701D); Hebei Province Key Research and Development Project (No. 19210404D); Mobile computing and universal equipment for the Beijing Key Laboratory Open Project; The National Social Science Fund of China (17AJL014); Beijing University of Posts and Telecommunications Construction of World-Class Disciplines and Characteristic Development Guidance Special Fund "Cultural Inheritance and Innovation" Project (No. 505019221); National Natural Science Foundation of China (No. U1536112); National Natural Science Foundation of China (No. 81673697); National Natural Science Foundation of China (61872046); The National Social Science Fund Key Project of China (No. 17AJL014); "Blue Fire Project" (Huizhou) University of Technology Joint Innovation Project (CXZJHZ201729); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201902218004); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201902024006); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201901197007); Industry-University Cooperation Collaborative Education Project of the Ministry of Education (No. 201901199005); The Ministry of Education Industry-University Cooperation Collaborative Education Project (No. 201901197001); Shijiazhuang science and technology plan project (236240267A); Hebei Province key research and development plan project (20312701D).
Funding: Supported by the Fundamental Research Funds for University of Science and Technology Beijing (FRF-BR-12-021).
Funding: Supported by the National Natural Science Foundation of China (No. 60771068) and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2007F248).
Funding: Funded by the Pyramid Talent Training Project of Beijing University of Civil Engineering and Architecture under Grant GJZJ20220802.
Funding: Supported by the National Natural Science Foundation of China (No. 41001285).
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11271361 and 70921061); the CAS/SAFEA International Partnership Program for Creative Research Teams, Major International (Regional) Joint Research Project (Grant No. 71110107026); the Ministry of Water Resources Special Funds for Scientific Research on Public Causes (Grant No. 201301094); Hong Kong Polytechnic University (Grant No. B-Q10D).
Funding: Supported by the National Natural Science Foundation of China (Nos. 61202082 and 61003285) and the Fundamental Research Funds for the Central Universities of China (Nos. BUPT2012RC0219 and BUPT2012RC0218).
Abstract: Intrusion detection systems (IDSs) are becoming a critical component of network security. However, many proposed intelligent intrusion detection models still do not perform well enough to be applied to real network security. This paper explores a novel and effective approach to significantly improving the performance of IDSs. An intrusion detection model based on twin support vector machines (TWSVMs) is proposed, together with an efficient algorithm for determining the parameters of the TWSVMs. The performance of the proposed model is evaluated on the KDD'99 dataset and compared with that of some recent intrusion detection models. The results demonstrate that the proposed model achieves a remarkable improvement in intrusion detection rate and more balanced performance across attack types. Moreover, TWSVMs require much less training time than standard support vector machines (SVMs).
Funding: This research was supported by the National Natural Science Foundation of China (No. 11771275).
Abstract: In this paper, a new quadratic kernel-free least squares twin support vector machine (QLSTSVM) is proposed for binary classification problems. The advantage of QLSTSVM is that there is no need to select a kernel function and its related parameters for nonlinear classification problems. After applying a consensus technique, we adopt the alternating direction method of multipliers to solve the reformulated consensus QLSTSVM directly. To reduce CPU time, the Karush-Kuhn-Tucker (KKT) conditions are also used to solve the QLSTSVM. The performance of QLSTSVM is tested on two artificial datasets and several University of California, Irvine (UCI) benchmark datasets. Numerical results indicate that the QLSTSVM may outperform several existing twin support vector machine methods with a Gaussian kernel in terms of classification accuracy and running time.
Funding: Supported by the National Natural Science Foundation of China (61772020, 62202433, 62172371, 62272422, 62036010), the Natural Science Foundation of Henan Province (22100002), and the Postdoctoral Research Grant in Henan Province (202103111).
Abstract: The least squares projection twin support vector machine (LSPTSVM) computes faster than the classical least squares support vector machine (LSSVM). However, LSPTSVM is sensitive to outliers and its solution lacks sparsity, so it has difficulty processing large-scale datasets that contain outliers. In this paper, we propose a robust LSPTSVM model (called R-LSPTSVM) that applies a truncated least squares loss function. The robustness of R-LSPTSVM is proved from a weighted perspective. Furthermore, we obtain a sparse solution of R-LSPTSVM by using the pivoting Cholesky factorization method in the primal space. Finally, the sparse R-LSPTSVM algorithm (SR-LSPTSVM) is proposed. Experimental results show that SR-LSPTSVM is insensitive to outliers and can handle large-scale datasets quickly.
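To make the idea of a truncated least squares loss concrete, here is a minimal sketch. It assumes the common form min(r^2, tau^2) with a hypothetical threshold `tau`; the abstract does not give the exact definition used in R-LSPTSVM, so the function names and the 0/1 weighting below are illustrative only.

```python
import numpy as np

def truncated_squared_loss(residuals, tau):
    """Squared loss capped at tau**2, so a single outlier has bounded influence."""
    r = np.asarray(residuals, dtype=float)
    return np.minimum(r**2, tau**2)

def truncation_weights(residuals, tau):
    """Weighted view: weight 1 inside the truncation region, 0 outside.

    Minimizing the weighted squared loss with these weights ignores samples
    whose residual exceeds tau, which is one way to read the robustness claim.
    """
    r = np.asarray(residuals, dtype=float)
    return (np.abs(r) <= tau).astype(float)

residuals = np.array([0.5, -0.2, 3.0])  # the last entry plays the outlier
losses = truncated_squared_loss(residuals, tau=1.0)
weights = truncation_weights(residuals, tau=1.0)
```

With the plain squared loss the outlier would contribute 9.0 to the objective; under truncation its contribution is capped at `tau**2`.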
Funding: Supported in part by the National Natural Science Foundation of China (51875457), the Natural Science Foundation of Shaanxi Province of China (2021JQ-701), the Key Research Project of Shaanxi Province (2022GY-050, 2022GY-028), and the Xi'an Science and Technology Plan Project (2020KJRC0109).
Abstract: For classification problems, the traditional least squares twin support vector machine (LSTSVM) generates two nonparallel hyperplanes directly by solving two systems of linear equations instead of a pair of quadratic programming problems (QPPs), which makes LSTSVM much faster than the original TSVM. However, the standard LSTSVM adopts a quadratic loss measured by the minimal distance, which makes it sensitive to noise and unstable under re-sampling. To overcome this problem, the expectile distance is used to measure the margin between classes, and an LSTSVM with asymmetric squared loss (aLSTSVM) is proposed. Compared with the original LSTSVM with quadratic loss, the proposed aLSTSVM not only has comparable accuracy but also exhibits desirable properties such as noise insensitivity, scatter minimization, and re-sampling stability. Numerical experiments on synthetic datasets, normally distributed clustered (NDC) datasets, and University of California, Irvine (UCI) datasets with different types of noise confirm the performance and validity of the proposed algorithm.
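The statement that LSTSVM needs only two linear systems can be sketched as follows. This is a minimal illustration of the standard linear LSTSVM with symmetric quadratic loss, not the aLSTSVM proposed in the paper; the function names, the penalty parameters `c1` and `c2`, and the small `ridge` term added for numerical stability are assumptions.

```python
import numpy as np

def fit_lstsvm(A, B, c1=1.0, c2=1.0, ridge=1e-8):
    """Fit two nonparallel hyperplanes; A holds class +1 rows, B holds class -1 rows.

    Each plane is returned as z = [w; b], obtained from one linear system.
    """
    E = np.hstack([A, np.ones((A.shape[0], 1))])  # augmented class +1 data
    F = np.hstack([B, np.ones((B.shape[0], 1))])  # augmented class -1 data
    reg = ridge * np.eye(E.shape[1])              # assumed stabilizer, not in the abstract
    # Plane 1: close to class +1, at functional distance 1 from class -1.
    z1 = -np.linalg.solve(F.T @ F + (E.T @ E) / c1 + reg, F.T @ np.ones(F.shape[0]))
    # Plane 2: close to class -1, at functional distance 1 from class +1.
    z2 = np.linalg.solve(E.T @ E + (F.T @ F) / c2 + reg, E.T @ np.ones(E.shape[0]))
    return z1, z2

def predict_lstsvm(X, z1, z2):
    """Assign each row of X to the class whose hyperplane is nearer."""
    d1 = np.abs(X @ z1[:-1] + z1[-1]) / np.linalg.norm(z1[:-1])
    d2 = np.abs(X @ z2[:-1] + z2[-1]) / np.linalg.norm(z2[:-1])
    return np.where(d1 <= d2, 1, -1)

# Toy data: two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
A = rng.normal([2.0, 2.0], 0.3, size=(30, 2))
B = rng.normal([-2.0, -2.0], 0.3, size=(30, 2))
z1, z2 = fit_lstsvm(A, B)
labels = predict_lstsvm(np.vstack([A, B]), z1, z2)
```

Both solves involve only an (n+1) by (n+1) matrix, where n is the feature dimension, which is where the speed advantage over solving two QPPs comes from.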
Funding: This work was supported by the National Natural Science Foundation of China (No. 11771275). The second author thanks the Dutch Research Council for partial support (No. 040.11.724).
Abstract: In general, data contain noise that comes from faulty instruments, flawed measurements, or faulty communication. Learning from data in the context of classification or regression is inevitably affected by this noise. To remove or greatly reduce its impact, we combine the ideas of fuzzy membership functions and the Laplacian twin support vector machine (Lap-TSVM). A formulation of the linear intuitionistic fuzzy Laplacian twin support vector machine (IFLap-TSVM) is presented. Moreover, we extend the linear IFLap-TSVM to the nonlinear case by means of a kernel function. The proposed IFLap-TSVM mitigates the negative impact of noise and outliers by using fuzzy membership functions, and it is a more accurate and reasonable classifier because it uses the geometric distribution information of labeled and unlabeled data through manifold regularization. Experiments on constructed artificial datasets, several UCI benchmark datasets, and the MNIST dataset show that the IFLap-TSVM has better classification accuracy than the state-of-the-art twin support vector machine (TSVM), intuitionistic fuzzy twin support vector machine (IFTSVM), and Lap-TSVM.
Funding: Supported in part by the National Natural Science Foundation of China (51875457), the Natural Science Foundation of Shaanxi Province of China (2021JQ-701), and the Xi'an Science and Technology Plan Project (2020KJRC0109).
Abstract: The previously presented robust minimum class variance twin support vector machine (RMCV-TWSVM) achieves better classification performance than the classical TWSVM. The RMCV-TWSVM introduces the class variance matrices of the positive and negative samples into the construction of the two hyperplanes. However, it does not consider the overall structural information of all the samples, which can substantially reduce its classification accuracy. In this paper, a new algorithm named structural regularized TWSVM based on within-class scatter and between-class scatter (WSBS-STWSVM) is put forward. The WSBS-STWSVM can make full use of the overall within-class distribution information and between-class structure information of all the samples. The experimental results illustrate the high classification accuracy and strong generalization ability of the proposed algorithm.
Funding: Supported by the National Natural Science Foundation of China (No. 60572157) and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.
Abstract: Classifying intrusion attacks and normal network flow is a critical and challenging issue in network security research. Many intelligent intrusion detection models have been proposed, but their performance and efficiency are not satisfactory for real computer networks. This paper presents a novel and effective intrusion detection system based on a statistical reference model and twin support vector machines (TWSVMs). Moreover, a network flow feature selection procedure has been studied and implemented with TWSVMs. The performance of the proposed system is evaluated on the data set of the fifth international conference on knowledge discovery and data mining in 1999 (KDD'99), collected at MIT's Lincoln Labs, and the results indicate that the proposed system is more efficient and effective than conventional support vector machines (SVMs) and TWSVMs.
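Abstract: [Objective] In landslide susceptibility assessment, the selection and optimization of the landslide prediction model are crucial to the efficiency of the computation and the accuracy of the prediction results. Existing single-objective genetic algorithms (GA) tend to converge prematurely, have poor local search ability, and are slow at global optimization. To address these problems, a new optimization framework is proposed that combines a classic multi-objective genetic algorithm, the nondominated sorting genetic algorithm with an elite strategy (NSGA-II), with commonly used machine learning models (random forest, RF, and support vector machine, SVM) for landslide susceptibility prediction. Unlike single-objective optimization, NSGA-II can perform feature selection and hyperparameter optimization at the same time, and it lets the prediction model achieve optimal accuracy, recall, precision, and area under the curve (AUC) simultaneously. [Methods] Taking the Chongqing section of the Three Gorges Reservoir area as the study area, four optimized models (RF-GA, SVM-GA, RF-NSGA-II, SVM-NSGA-II) were compared in three respects: model accuracy evaluation, landslide susceptibility zoning maps, and zonal statistics. [Results] NSGA-II produced a more pronounced optimization effect than GA. In both model evaluation and landslide susceptibility zoning, the RF-NSGA-II model had higher predictive performance, with the four evaluation values being 80.91%, 81.89%, 80.07%, and 88.60%, respectively, which demonstrates the effectiveness of the NSGA-II optimization algorithm. The area proportions of the very low to very high hazard zones were 23.06%, 22.46%, 22.96%, 19.99%, and 11.53%, in that order, which verifies the reliability of the RF-NSGA-II model. The susceptibility map predicted by the RF-NSGA-II model shows that the high and very high susceptibility zones are concentrated in the north of the study area and are distributed in a belt from east to west. [Conclusions] The RF-NSGA-II model based on multi-objective selection adopted in this study provides a new approach to tuning machine learning models in landslide susceptibility assessment.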
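The core of non-dominated sorting, which lets NSGA-II optimize several metrics at once, can be sketched as below. This is a minimal illustration that extracts only the first Pareto front for objectives that are all maximized (for example accuracy, recall, precision, and AUC); it is not the full NSGA-II procedure with crowding distance and elitist selection, and the candidate scores are made-up values, not results from the study.

```python
import numpy as np

def dominates(a, b):
    """True if candidate a is no worse than b on every objective and better on at least one."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return bool(np.all(a >= b) and np.any(a > b))

def first_pareto_front(scores):
    """Return indices of candidates that no other candidate dominates."""
    scores = np.asarray(scores, dtype=float)
    front = []
    for i in range(len(scores)):
        dominated = any(
            dominates(scores[j], scores[i]) for j in range(len(scores)) if j != i
        )
        if not dominated:
            front.append(i)
    return front

# Hypothetical (accuracy, recall) scores for three candidate model configurations.
candidates = [
    [0.80, 0.70],  # best accuracy
    [0.70, 0.80],  # best recall
    [0.60, 0.60],  # worse than both of the others on both objectives
]
front = first_pareto_front(candidates)
```

A single-objective GA would have to collapse these metrics into one number; keeping the whole non-dominated front preserves the trade-off between them until a final model is chosen.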