Funding: Hebei Province Key Research and Development Project (No. 20313701D); Hebei Province Key Research and Development Project (No. 19210404D); Mobile computing and universal equipment for the Beijing Key Laboratory Open Project; The National Social Science Fund of China (17AJL014); Beijing University of Posts and Telecommunications Construction of World-Class Disciplines and Characteristic Development Guidance Special Fund "Cultural Inheritance and Innovation" Project (No. 505019221); National Natural Science Foundation of China (No. U1536112); National Natural Science Foundation of China (No. 81673697); National Natural Science Foundation of China (61872046); The National Social Science Fund Key Project of China (No. 17AJL014); "Blue Fire Project" (Huizhou) University of Technology Joint Innovation Project (CXZJHZ201729); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201902218004); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201902024006); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201901197007); Industry-University Cooperation Collaborative Education Project of the Ministry of Education (No. 201901199005); The Ministry of Education Industry-University Cooperation Collaborative Education Project (No. 201901197001); Shijiazhuang science and technology plan project (236240267A); Hebei Province key research and development plan project (20312701D).
Abstract: The distribution of data has a significant impact on classification results. When one class is far less represented than another, data imbalance occurs. This leads to more outliers and noise, so the speed and performance of classification can be greatly affected. To address these problems, this paper starts from the motivation and mathematical representation of classification and puts forward a new classification method based on the relationship between different classification formulations. Combined with the vector characteristics of the actual problem and the choice of matrix characteristics, we first analyze ordinal regression and introduce slack variables to solve the constraint problem of isolated points. Then, on the basis of the support vector machine, we introduce fuzzy factors to solve the problem of the gap between isolated points. We introduce cost control to solve the problem of sample skew. Finally, based on the bi-boundary support vector machine, a two-step weight-setting twin classifier is constructed. This can help to identify multitasks with feature-selected patterns without the need for additional optimizers, which addresses the difficulty large-scale classification has in dealing effectively with a very small gap between category distributions.
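The cost control and fuzzy factors mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so both rules below are assumptions chosen for illustration: class costs inversely proportional to class frequency, and a distance-to-center fuzzy membership of the form s_i = 1 - d_i / (r + delta). The function names `balanced_class_costs` and `fuzzy_memberships` are hypothetical.

```python
import numpy as np


def balanced_class_costs(y, base_cost=1.0):
    """Assumed rule: C_k = base_cost * n / (K * n_k), so rarer classes cost more."""
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    costs = base_cost * y.size / (classes.size * counts)
    return {c.item(): float(v) for c, v in zip(classes, costs)}


def fuzzy_memberships(X, delta=1e-3):
    """Assumed rule for the samples of ONE class: s_i = 1 - d_i / (r + delta).

    d_i is the distance of sample i to the class mean and r is the largest
    such distance, so points far from the center (likely outliers) get
    memberships close to zero.
    """
    X = np.asarray(X, dtype=float)
    center = X.mean(axis=0)
    dist = np.linalg.norm(X - center, axis=1)
    radius = dist.max()
    return 1.0 - dist / (radius + delta)


# Usage: a 90/10 skewed label vector and a tiny one-class sample.
costs = balanced_class_costs([1] * 90 + [-1] * 10)
memberships = fuzzy_memberships([[0.0, 0.0], [2.0, 0.0], [1.0, 0.0]])
```

In a weighted SVM, the penalty applied to the slack variable of sample i would then be the product of its class cost and its membership, so minority-class samples count more and suspected outliers count less.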
Funding: supported by the Program for New Century Excellent Talents in University (No. NCET-08-0836), the National Natural Science Foundation of China (Nos. 60804022, 60974050 and 61072094), the Fok Ying-Tung Education Foundation for Young Teachers (No. 121066), and the Natural Science Foundation of Jiangsu Province (No. BK2008126).
Abstract: Coal mines require various kinds of machinery, and the fault diagnosis of this equipment has a great impact on mine production. Traditional support vector machines tend to misclassify noisy data; this problem is addressed by a proposed Probability Least Squares Support Vector Classification Machine (PLSSVCM). Samples that cannot be definitely determined as belonging to one class are assigned to a class by the PLSSVCM based on a probability value. This gives the classification results both a qualitative explanation and a quantitative evaluation. Simulation results of a fault diagnosis show that the classification accuracy of the PLSSVCM is 100%. Even when the samples are noisy, the PLSSVCM can still effectively realize multi-class fault diagnosis of a roller bearing. The generalization ability of the PLSSVCM is better than that of a neural network and an LSSVCM.
Funding: supported by the Fundamental Research Funds for University of Science and Technology Beijing (FRF-BR-12-021).
Abstract: To handle the semi-supervised problem quickly and efficiently in the twin support vector machine (TWSVM) field, a semi-supervised twin support vector machine (S2TSVM) is proposed by adding the original unlabeled samples. In S2TSVM, the addition of unlabeled samples can easily cause the classification hyperplane to deviate from the sample points. A center-distance principle is therefore proposed to pre-classify the unlabeled samples, yielding a pre-classified S2TSVM (PS2TSVM). Compared with S2TSVM, PS2TSVM not only alleviates the problem of the samples deviating from the classification hyperplane, but also improves the training speed. PS2TSVM is then smoothed to obtain the pre-classified smooth S2TSVM (PS3TSVM), and its convergence is derived. Finally, nine datasets from the UCI machine learning database are selected for comparison with other types of semi-supervised models. The experimental results show that the proposed PS3TSVM model achieves better classification results.
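The pre-classification step can be pictured with a minimal sketch. The abstract does not define the center-distance principle, so the rule below (give each unlabeled sample the label of the nearer class mean) is an assumption for illustration only, and `preclassify_by_center` is a hypothetical name.

```python
import numpy as np


def preclassify_by_center(X_pos, X_neg, X_unlabeled):
    """Assumed rule: label each unlabeled sample +1 or -1 by the nearer class mean."""
    c_pos = np.asarray(X_pos, dtype=float).mean(axis=0)
    c_neg = np.asarray(X_neg, dtype=float).mean(axis=0)
    U = np.asarray(X_unlabeled, dtype=float)
    d_pos = np.linalg.norm(U - c_pos, axis=1)  # distance to the positive center
    d_neg = np.linalg.norm(U - c_neg, axis=1)  # distance to the negative center
    return np.where(d_pos <= d_neg, 1, -1)


# Usage: two labeled points per class and two unlabeled points.
pseudo_labels = preclassify_by_center(
    X_pos=[[0.0, 0.0], [1.0, 0.0]],
    X_neg=[[4.0, 4.0], [5.0, 4.0]],
    X_unlabeled=[[0.5, 0.2], [4.0, 4.1]],
)
```

Pre-labeling of this kind turns the unlabeled set into ordinary training data for each of the two twin problems, which is one plausible reason for the reported gain in training speed.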
Abstract: With the progress of deep learning research, convolutional neural networks (CNNs) have become the most important method for feature extraction. How effectively the extracted features are classified and recognized directly affects the performance of the entire network. Traditional processing methods include classification models such as fully connected networks and support vector machines. To address the tendency of traditional convolutional neural networks to over-fit when classifying small samples, a CNN-TWSVM hybrid model is proposed that uses the computationally more efficient twin support vector machine (TWSVM) as the CNN classifier, and it is applied to the traffic sign recognition task. To improve the generalization ability of the model, a wavelet kernel function is introduced to handle the nonlinear classification task. The method fine-tunes a network initialized on the ImageNet dataset for the specific domain and takes the output of an inner layer of the network to extract highly abstract features of the traffic sign image. Finally, the TWSVM based on the wavelet kernel function is used to identify the traffic signs, which effectively addresses the over-fitting problem in traffic sign classification. On the GTSRB and BELGIUMTS datasets, the validity and generalization ability of the improved model are verified by comparison with different kernel functions and different SVM classifiers.
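As an illustration of what a wavelet kernel looks like, the sketch below computes a Gram matrix with the Morlet-type translation-invariant wavelet kernel, K(x, y) = prod_i cos(1.75 (x_i - y_i) / a) exp(-(x_i - y_i)^2 / (2 a^2)). The abstract does not say which wavelet kernel the paper uses, so this particular form, the dilation parameter `a`, and the function name are assumptions.

```python
import numpy as np


def morlet_wavelet_kernel(X, Y, a=1.0):
    """Gram matrix of a Morlet-type wavelet kernel between the rows of X and Y.

    `a` is the dilation (scale) parameter; the kernel is a product over the
    feature dimensions of cos(1.75 * u) * exp(-u**2 / 2) with u = (x_i - y_i) / a.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    u = (X[:, None, :] - Y[None, :, :]) / a  # shape (n_x, n_y, n_features)
    return np.prod(np.cos(1.75 * u) * np.exp(-0.5 * u**2), axis=-1)


# Usage: kernel matrix of three 2-D feature vectors with themselves.
features = [[0.0, 1.0], [2.0, -1.0], [0.5, 0.5]]
K = morlet_wavelet_kernel(features, features, a=2.0)
```

In a hybrid model of the kind described, the rows of `X` and `Y` would be the feature vectors taken from the inner CNN layer, and `K` would replace the linear inner product in the kernelized classifier.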
Funding: supported by the National Natural Science Foundation of China (No. 60771068) and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2007F248).
Abstract: Tensor representation is useful for reducing the overfitting problem of vector-based learning algorithms in pattern recognition. This is mainly because the structural information of objects in pattern analysis is a reasonable constraint that reduces the number of unknown parameters used to model a classifier. In this paper, we generalize the vector-based learning algorithm TWin Support Vector Machine (TWSVM) to the tensor-based method TWin Support Tensor Machines (TWSTM), which accepts general tensors as input. To examine the effectiveness of TWSTM, we implement the TWSTM method for Microcalcification Clusters (MCs) detection. In the tensor subspace domain, the MCs detection procedure is formulated as a supervised learning and classification problem, and TWSTM is used as a classifier to decide whether MCs are present. A large number of experiments were carried out to evaluate and compare the performance of the proposed MCs detection algorithm. Compared with TWSVM, the tensor version reduces the overfitting problem.
Abstract: SVM handles a classification problem by considering only the samples themselves, so the classification result depends on the characteristics of the training samples but not on the current information about the problem being classified. Starting from the phenomenon of data crossing in systems, this paper improves the classification performance of SVM by adding a prior probability term, which reflects information about the problem being classified, to the decision function, thereby fusing the Bayesian criterion into SVM. The detailed derivation and the implementation steps of the algorithm are presented. The algorithm is verified on two instances. The results show that the new algorithm performs better than the traditional SVM algorithm, and that both its robustness and its sensitivity are improved.
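One way to picture a prior probability term in the decision function is the sketch below, which shifts the raw SVM score by a weighted log prior odds. The abstract does not give the paper's formula, so this additive log-odds form, the `weight` parameter, and the function name are all assumptions made for illustration.

```python
import math


def decide_with_prior(svm_score, prior_pos, weight=1.0):
    """Assumed rule: sign(f(x) + weight * log(P(+1) / P(-1))).

    svm_score is the raw SVM output f(x) = w.x + b, prior_pos is the prior
    probability of the positive class, and weight scales the prior term.
    """
    if not 0.0 < prior_pos < 1.0:
        raise ValueError("prior_pos must lie strictly between 0 and 1")
    adjusted = svm_score + weight * math.log(prior_pos / (1.0 - prior_pos))
    return 1 if adjusted >= 0 else -1


# Usage: the same weakly positive score under a neutral and a skewed prior.
neutral = decide_with_prior(0.2, prior_pos=0.5)  # prior term is zero
skewed = decide_with_prior(0.2, prior_pos=0.1)   # prior term is negative
```

With a neutral prior the decision reduces to the ordinary SVM sign rule; a strongly skewed prior can flip samples that lie close to the separating hyperplane.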
Funding: supported by the National Natural Science Foundation of China (61772020, 62202433, 62172371, 62272422, 62036010), the Natural Science Foundation of Henan Province (22100002), and the Postdoctoral Research Grant in Henan Province (202103111).
Abstract: Least squares projection twin support vector machine (LSPTSVM) has faster computing speed than the classical least squares support vector machine (LSSVM). However, LSPTSVM is sensitive to outliers and its solution lacks sparsity. Therefore, it is difficult for LSPTSVM to process large-scale datasets with outliers. In this paper, we propose a robust LSPTSVM model (called R-LSPTSVM) by applying a truncated least squares loss function. The robustness of R-LSPTSVM is proved from a weighted perspective. Furthermore, we obtain the sparse solution of R-LSPTSVM by using the pivoting Cholesky factorization method in primal space. Finally, the sparse R-LSPTSVM algorithm (SR-LSPTSVM) is proposed. Experimental results show that SR-LSPTSVM is insensitive to outliers and can process large-scale datasets quickly.
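The idea behind a truncated least squares loss, and the weighted view of its robustness, can be sketched as follows. The abstract does not state the exact truncation rule, so the form min(r^2, tau) and the 0/1 weights below are assumptions for illustration; `tau` and both function names are hypothetical.

```python
import numpy as np


def truncated_squared_loss(residual, tau):
    """Assumed form: min(r**2, tau), so large residuals contribute at most tau."""
    r2 = np.square(np.asarray(residual, dtype=float))
    return np.minimum(r2, tau)


def truncation_weights(residual, tau):
    """Weighted view: weight 1 where r**2 <= tau, weight 0 where the loss is capped."""
    r2 = np.square(np.asarray(residual, dtype=float))
    return (r2 <= tau).astype(float)


# Usage: one inlier residual and one outlier residual with tau = 4.
loss = truncated_squared_loss([0.5, -3.0], tau=4.0)
weights = truncation_weights([0.5, -3.0], tau=4.0)
```

Samples whose loss has been capped no longer influence the gradient, which is equivalent to giving them zero weight in an ordinary least squares fit and is one way to read the "weighted perspective" on robustness.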
Funding: supported in part by the National Natural Science Foundation of China (51875457), the Natural Science Foundation of Shaanxi Province of China (2021JQ-701), the Key Research Project of Shaanxi Province (2022GY-050, 2022GY-028), and the Xi'an Science and Technology Plan Project (2020KJRC0109).
Abstract: For classification problems, the traditional least squares twin support vector machine (LSTSVM) generates two nonparallel hyperplanes directly by solving two systems of linear equations instead of a pair of quadratic programming problems (QPPs), which makes LSTSVM much faster than the original TSVM. However, the standard LSTSVM, which adopts a quadratic loss measured by the minimal distance, is sensitive to noise and unstable under re-sampling. To overcome this problem, the expectile distance is used to measure the margin between classes, and LSTSVM with asymmetric squared loss (aLSTSVM) is proposed. Compared with the original LSTSVM with the quadratic loss, the proposed aLSTSVM not only has comparable accuracy, but also exhibits desirable properties such as noise insensitivity, scatter minimization and re-sampling stability. Numerical experiments on synthetic datasets, normally distributed clustered (NDC) datasets and University of California, Irvine (UCI) datasets with different noise confirm the performance and validity of the proposed algorithm.
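The statement that LSTSVM needs only two linear systems can be made concrete with a minimal sketch of the standard linear LSTSVM (not the proposed aLSTSVM). With E = [A e] and F = [B e], the two planes are taken here as z1 = -(F^T F + E^T E / c1)^{-1} F^T e and z2 = (E^T E + F^T F / c2)^{-1} E^T e. The small `ridge` term, the normalized-distance decision rule, and the function names are assumptions added for numerical stability and illustration.

```python
import numpy as np


def fit_linear_lstsvm(A, B, c1=1.0, c2=1.0, ridge=1e-8):
    """Fit the two nonparallel planes of a linear LSTSVM.

    A holds the class +1 samples, B the class -1 samples (one row each).
    Each returned vector z = [w, b] stacks the normal vector and the bias.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    E = np.hstack([A, np.ones((A.shape[0], 1))])  # [A e]
    F = np.hstack([B, np.ones((B.shape[0], 1))])  # [B e]
    reg = ridge * np.eye(E.shape[1])              # assumed stabilizer
    # Plane 1: close to class A, pushed to about unit distance from class B.
    z1 = -np.linalg.solve(F.T @ F + E.T @ E / c1 + reg, F.T @ np.ones(F.shape[0]))
    # Plane 2: close to class B, pushed to about unit distance from class A.
    z2 = np.linalg.solve(E.T @ E + F.T @ F / c2 + reg, E.T @ np.ones(E.shape[0]))
    return z1, z2


def predict_lstsvm(X, z1, z2):
    """Assign each row of X to the class whose plane is nearer."""
    X = np.asarray(X, dtype=float)
    d1 = np.abs(X @ z1[:-1] + z1[-1]) / np.linalg.norm(z1[:-1])
    d2 = np.abs(X @ z2[:-1] + z2[-1]) / np.linalg.norm(z2[:-1])
    return np.where(d1 <= d2, 1, -1)


# Usage: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
A = rng.normal(0.0, 0.5, size=(40, 2))
B = rng.normal(0.0, 0.5, size=(40, 2)) + 5.0
z1, z2 = fit_linear_lstsvm(A, B)
pred = predict_lstsvm(np.vstack([A, B]), z1, z2)
```

Each fit costs one solve of an (n+1) by (n+1) system, where n is the number of features, which is the source of the speed advantage over solving a pair of QPPs. An asymmetric squared loss would weight positive and negative residuals differently and, in general, no longer admits this single-solve closed form.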
Funding: the National Natural Science Foundation of China (No. 60572157) and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.
Abstract: Classifying intrusion attacks and normal network flow is a critical and challenging issue in network security research. Many intelligent intrusion detection models have been proposed, but their performance and efficiency are not satisfactory for real computer networks. This paper presents a novel and effective intrusion detection system based on a statistical reference model and twin support vector machines (TWSVMs). Moreover, a network flow feature selection procedure has been studied and implemented with TWSVMs. The performance of the proposed system is evaluated using the data set of the fifth international conference on knowledge discovery and data mining in 1999 (KDD'99), collected at MIT's Lincoln Labs, and the results indicate that the proposed system is more efficient and effective than conventional support vector machines (SVMs) and TWSVMs.
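Abstract: To further improve the accuracy of mechanical condition monitoring for on-load tap changers (OLTCs), this paper analyzes the vibration signals produced during OLTC switching using an optimized tunable Q-factor wavelet transform (TQWT). Specifically, the artificial fish swarm algorithm (AFSA) is used to study an optimized TQWT decomposition method based on the decomposition residual and the overall orthogonality coefficient, multiple subsequences of the OLTC vibration signal are computed, and an OLTC mechanical fault diagnosis model based on an optimized twin support vector machine (TWSVM) is constructed. Analysis of the vibration signals of a CM-type OLTC under normal conditions and typical mechanical faults shows that the proposed optimized TQWT decomposition method effectively improves the accuracy of the OLTC vibration signal decomposition. Compared with other diagnosis models, the constructed AFSA-TWSVM fault diagnosis model classifies well and converges faster.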