Funding: Hebei Province Key Research and Development Project (No. 20313701D); Hebei Province Key Research and Development Project (No. 19210404D); Mobile Computing and Universal Equipment for the Beijing Key Laboratory Open Project; The National Social Science Fund of China (17AJL014); Beijing University of Posts and Telecommunications Construction of World-Class Disciplines and Characteristic Development Guidance Special Fund “Cultural Inheritance and Innovation” Project (No. 505019221); National Natural Science Foundation of China (No. U1536112); National Natural Science Foundation of China (No. 81673697); National Natural Science Foundation of China (61872046); The National Social Science Fund Key Project of China (No. 17AJL014); “Blue Fire Project” (Huizhou) University of Technology Joint Innovation Project (CXZJHZ201729); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201902218004); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201902024006); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201901197007); Industry-University Cooperation Collaborative Education Project of the Ministry of Education (No. 201901199005); The Ministry of Education Industry-University Cooperation Collaborative Education Project (No. 201901197001); Shijiazhuang Science and Technology Plan Project (236240267A); Hebei Province Key Research and Development Plan Project (20312701D).
Abstract: The distribution of data has a significant impact on classification results. When one class is sparsely represented compared with another, data imbalance occurs. This increases the number of outliers and the amount of noise, so the speed and performance of classification can be greatly affected. To address these problems, this paper starts from the motivation and mathematical representation of classification and puts forward a new classification method based on the relationships between different classification formulations. Taking into account the vector characteristics of the actual problem and the choice of matrix characteristics, we first analyze ordinal regression and introduce slack variables to solve the constraint problem of isolated points. We then introduce fuzzy factors, on the basis of the support vector machine, to solve the problem of the gap between isolated points, and we introduce cost control to solve the problem of sample skew. Finally, based on the bi-boundary support vector machine, a two-step weight-setting twin classifier is constructed. It can help identify multiple tasks with feature-selected patterns without additional optimizers, which solves the problem that large-scale classification cannot deal effectively with a very low category distribution gap.
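The abstract does not say how the cost control is set. As a minimal sketch of one common choice, the penalty for each class can be made inversely proportional to its frequency, so that errors on the rare class cost more. The function name `balanced_class_costs`, the `base_cost` parameter, and the weighting formula are assumptions for illustration, not the paper's method.

```python
import numpy as np

def balanced_class_costs(y, base_cost=1.0):
    """Return {label: cost}, with cost inversely proportional to class frequency.

    Assumed heuristic (not taken from the paper):
        cost_k = base_cost * n_samples / (n_classes * n_k)
    """
    labels, counts = np.unique(y, return_counts=True)
    n_samples, n_classes = len(y), len(labels)
    costs = base_cost * n_samples / (n_classes * counts)
    return {label: float(c) for label, c in zip(labels.tolist(), costs)}

# Toy skewed sample: 90 majority points (+1) and 10 minority points (-1).
y = np.array([+1] * 90 + [-1] * 10)
costs = balanced_class_costs(y)
print(costs)  # the minority class receives the larger penalty
```

With these costs the total penalty mass is the same for both classes, which is one simple way to counter sample skew.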
Funding: Supported by the Fundamental Research Funds for University of Science and Technology Beijing (FRF-BR-12-021).
Abstract: To handle the semi-supervised problem quickly and efficiently in the twin support vector machine (TWSVM) field, a semi-supervised twin support vector machine (S2TSVM) is proposed by adding the original unlabeled samples. In S2TSVM, the addition of unlabeled samples can easily cause the classification hyperplane to deviate from the sample points. A center-distance principle is therefore proposed to pre-classify the unlabeled samples, giving a pre-classified S2TSVM (PS2TSVM). Compared with S2TSVM, PS2TSVM not only reduces the deviation of the samples from the classification hyperplane but also improves the training speed. PS2TSVM is then smoothed to obtain the pre-classified smooth S2TSVM (PS3TSVM), and its convergence is derived. Finally, nine datasets from the UCI machine learning repository are selected for comparison with other types of semi-supervised models. The experimental results show that the proposed PS3TSVM model achieves better classification results.
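The abstract does not define the center-distance principle. One plausible reading, sketched here under that assumption, is to assign each unlabeled sample to the class whose labeled center (mean) is nearest. The function name and the toy data are hypothetical.

```python
import numpy as np

def preclassify_by_center_distance(X_pos, X_neg, X_unlabeled):
    """Pseudo-label each unlabeled sample by its nearest labeled class center.

    Assumed interpretation of the center-distance principle: compare the
    Euclidean distance to the mean of each labeled class.
    """
    c_pos = X_pos.mean(axis=0)   # center of the positive class
    c_neg = X_neg.mean(axis=0)   # center of the negative class
    d_pos = np.linalg.norm(X_unlabeled - c_pos, axis=1)
    d_neg = np.linalg.norm(X_unlabeled - c_neg, axis=1)
    return np.where(d_pos <= d_neg, 1, -1)

X_pos = np.array([[2.0, 2.0], [3.0, 3.0]])
X_neg = np.array([[-2.0, -2.0], [-3.0, -3.0]])
X_unlabeled = np.array([[1.0, 2.0], [-1.0, -0.5], [4.0, 0.0]])

pseudo_labels = preclassify_by_center_distance(X_pos, X_neg, X_unlabeled)
print(pseudo_labels)
```

The pseudo-labeled samples could then be passed to a supervised twin SVM solver together with the labeled ones.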
Abstract: With the progress of deep learning research, convolutional neural networks (CNNs) have become the most important method for feature extraction. How effectively the extracted features are classified and recognized directly affects the performance of the entire network. Traditional processing methods include classification models such as fully connected networks and support vector machines. To address the tendency of traditional CNNs to over-fit when classifying small samples, a CNN-TWSVM hybrid model is proposed that uses the twin support vector machine (TWSVM), which has higher computational efficiency, as the CNN classifier, and the model is applied to traffic sign recognition. To improve the generalization ability of the model, a wavelet kernel function is introduced to handle the nonlinear classification task. The method fine-tunes a network initialized on the ImageNet dataset for the specific domain and takes an inner layer of the network to extract highly abstract features of the traffic sign image. Finally, the TWSVM with the wavelet kernel function is used to identify the traffic signs, which effectively mitigates the over-fitting problem in traffic sign classification. On the GTSRB and BELGIUMTS datasets, the validity and generalization ability of the improved model are verified by comparison with different kernel functions and different SVM classifiers.
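The abstract does not specify which wavelet kernel is used. A minimal sketch, assuming the widely used Morlet-type form K(x, y) = ∏_i cos(1.75 u_i) exp(-u_i² / 2) with u_i = (x_i - y_i) / a, is given below. The function name and the dilation parameter `a` are assumptions for illustration.

```python
import numpy as np

def morlet_wavelet_kernel(X, Y, a=1.0):
    """Gram matrix of a Morlet-type wavelet kernel (assumed form).

    K[i, j] = prod_k cos(1.75 * u_k) * exp(-u_k**2 / 2),
    where u = (X[i] - Y[j]) / a and a > 0 is the dilation parameter.
    """
    U = (X[:, None, :] - Y[None, :, :]) / a   # pairwise scaled differences
    return np.prod(np.cos(1.75 * U) * np.exp(-0.5 * U**2), axis=2)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # stand-in for CNN features of 5 images
K = morlet_wavelet_kernel(X, X, a=2.0)
print(K.shape)
```

The resulting matrix can be used wherever a kernel twin SVM expects a precomputed Gram matrix.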
Funding: Supported by the National Natural Science Foundation of China (No. 60771068) and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2007F248).
Abstract: Tensor representation is useful for reducing the overfitting problem of vector-based learning algorithms in pattern recognition. This is mainly because the structural information of objects in pattern analysis is a reasonable constraint that reduces the number of unknown parameters used to model a classifier. In this paper, we generalize the vector-based learning algorithm TWin Support Vector Machine (TWSVM) to the tensor-based method TWin Support Tensor Machines (TWSTM), which accepts general tensors as input. To examine the effectiveness of TWSTM, we implement the TWSTM method for Microcalcification Clusters (MCs) detection. In the tensor subspace domain, the MCs detection procedure is formulated as a supervised learning and classification problem, and TWSTM is used as a classifier to decide whether MCs are present. A large number of experiments were carried out to evaluate and compare the performance of the proposed MCs detection algorithm. Compared with TWSVM, the tensor version reduces the overfitting problem.
Funding: Supported by the National Natural Science Foundation of China (61772020, 62202433, 62172371, 62272422, 62036010), the Natural Science Foundation of Henan Province (22100002), and the Postdoctoral Research Grant in Henan Province (202103111).
Abstract: The least squares projection twin support vector machine (LSPTSVM) computes faster than the classical least squares support vector machine (LSSVM). However, LSPTSVM is sensitive to outliers and its solution lacks sparsity, so it is difficult for LSPTSVM to process large-scale datasets with outliers. In this paper, we propose a robust LSPTSVM model (called R-LSPTSVM) by applying a truncated least squares loss function. The robustness of R-LSPTSVM is proved from a weighted perspective. Furthermore, we obtain a sparse solution of R-LSPTSVM by using the pivoting Cholesky factorization method in the primal space. Finally, the sparse R-LSPTSVM algorithm (SR-LSPTSVM) is proposed. Experimental results show that SR-LSPTSVM is insensitive to outliers and can process large-scale datasets quickly.
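The exact truncated loss is not given in the abstract. A minimal sketch, assuming the common form L(r) = min(r², τ²), shows why truncation limits the influence of outliers: any residual beyond the threshold contributes a constant, which is equivalent to giving that sample zero weight in a weighted least squares view. The function names and the threshold `tau` are assumptions.

```python
import numpy as np

def truncated_sq_loss(r, tau):
    """Truncated least squares loss (assumed form): min(r**2, tau**2)."""
    return np.minimum(np.square(r), tau**2)

def truncation_weights(r, tau):
    """Implied sample weights: 1 inside the threshold, 0 for outliers."""
    return (np.abs(r) <= tau).astype(float)

residuals = np.array([0.1, -0.5, 3.0, -10.0])
losses = truncated_sq_loss(residuals, tau=1.0)
weights = truncation_weights(residuals, tau=1.0)
print(losses)   # large residuals are capped at tau**2
print(weights)  # outliers receive zero weight
```

Because the loss is flat beyond the threshold, a residual of 3 and a residual of 10 are penalized equally, unlike the plain squared loss.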
Funding: Supported in part by the National Natural Science Foundation of China (51875457), the Natural Science Foundation of Shaanxi Province of China (2021JQ-701), the Key Research Project of Shaanxi Province (2022GY-050, 2022GY-028), and the Xi’an Science and Technology Plan Project (2020KJRC0109).
Abstract: For classification problems, the traditional least squares twin support vector machine (LSTSVM) generates two nonparallel hyperplanes directly by solving two systems of linear equations instead of a pair of quadratic programming problems (QPPs), which makes LSTSVM much faster than the original TSVM. However, the standard LSTSVM, which adopts a quadratic loss measured by the minimal distance, is sensitive to noise and unstable under re-sampling. To overcome this problem, the expectile distance is used to measure the margin between classes, and an LSTSVM with asymmetric squared loss (aLSTSVM) is proposed. Compared with the original LSTSVM with the quadratic loss, the proposed aLSTSVM not only has comparable computational accuracy but also exhibits desirable properties such as noise insensitivity, scatter minimization, and re-sampling stability. Numerical experiments on synthetic datasets, normally distributed clustered (NDC) datasets, and University of California, Irvine (UCI) datasets with different noise levels confirm the performance and validity of the proposed algorithm.
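To make the "two systems of linear equations" concrete, here is a minimal sketch of a linear LSTSVM in its standard closed form (the baseline the abstract compares against, not the proposed aLSTSVM). With E = [A e] and F = [B e], the two planes are obtained from (FᵀF + EᵀE/c1) z1 = -Fᵀe and (EᵀE + FᵀF/c2) z2 = Eᵀe. The function names, the small `ridge` term added for numerical stability, and the toy data are assumptions.

```python
import numpy as np

def fit_linear_lstsvm(A, B, c1=1.0, c2=1.0, ridge=1e-8):
    """Fit the two nonparallel planes of a linear LSTSVM.

    A: samples of class +1, B: samples of class -1 (rows are samples).
    `ridge` is an assumed stabilizer, not part of the basic formulation.
    """
    E = np.hstack([A, np.ones((A.shape[0], 1))])   # [A e]
    F = np.hstack([B, np.ones((B.shape[0], 1))])   # [B e]
    EtE, FtF = E.T @ E, F.T @ F
    reg = ridge * np.eye(E.shape[1])
    # Plane 1 stays close to class +1 and about unit distance from class -1.
    z1 = -np.linalg.solve(FtF + EtE / c1 + reg, F.T @ np.ones(F.shape[0]))
    # Plane 2 stays close to class -1 and about unit distance from class +1.
    z2 = np.linalg.solve(EtE + FtF / c2 + reg, E.T @ np.ones(E.shape[0]))
    return (z1[:-1], z1[-1]), (z2[:-1], z2[-1])

def predict_lstsvm(X, plane1, plane2):
    """Assign each sample to the class whose plane is nearer."""
    (w1, b1), (w2, b2) = plane1, plane2
    d1 = np.abs(X @ w1 + b1) / np.linalg.norm(w1)
    d2 = np.abs(X @ w2 + b2) / np.linalg.norm(w2)
    return np.where(d1 <= d2, 1, -1)

# Toy data: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
A = rng.normal(scale=0.3, size=(20, 2)) + np.array([2.0, 2.0])
B = rng.normal(scale=0.3, size=(20, 2)) + np.array([-2.0, -2.0])

plane1, plane2 = fit_linear_lstsvm(A, B)
X = np.vstack([A, B])
y = np.array([1] * 20 + [-1] * 20)
accuracy = float(np.mean(predict_lstsvm(X, plane1, plane2) == y))
print(accuracy)
```

Each fit requires only one (n+1)×(n+1) linear solve per plane, where n is the number of features, which is where the speed advantage over a pair of QPPs comes from.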
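Abstract: The optimization idea behind the twin support vector machine (TWSVM) originates from the proximal SVM based on generalized eigenvalues (GEPSVM). TWSVM decomposes the traditional SVM problem into two convex programming problems, which reduces the training time to one quarter of the original. This paper modifies TWSVM: a special TWSVM (GTWSVM) is designed on the basis of a new optimization criterion, and a fast GTWSVM (FGTWSVM) is then proposed, which converts GTWSVM into an unconstrained convex programming problem. While achieving classification performance comparable to TWSVM and a faster computation speed, the algorithm also reduces the number of input-space features and the memory footprint. For nonlinear problems, FGTWSVM can reduce the number of kernel functions.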
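The one-quarter figure can be reproduced with a rough cost estimate. It assumes that solving a quadratic program scales cubically in the number of constraints and that the two classes each hold about half of the m training samples; both are simplifying assumptions that the abstract does not state.

```latex
T_{\mathrm{SVM}} \propto m^{3}, \qquad
T_{\mathrm{TWSVM}} \propto 2\left(\frac{m}{2}\right)^{3} = \frac{m^{3}}{4}
\quad\Longrightarrow\quad
\frac{T_{\mathrm{TWSVM}}}{T_{\mathrm{SVM}}} \approx \frac{1}{4}.
```

For strongly imbalanced classes the two subproblems differ in size, so the ratio departs from one quarter.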
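Abstract: To further improve the accuracy of mechanical condition monitoring for on-load tap changers (OLTCs), this paper analyzes the vibration signals produced during OLTC switching using an optimized tunable Q-factor wavelet transform (TQWT). Specifically, the artificial fish swarm algorithm (AFSA) is used to study an optimized TQWT decomposition method based on the decomposition residual and the overall orthogonality coefficient, multiple subsequences of the OLTC vibration signal are computed, and an OLTC mechanical fault diagnosis model based on an optimized twin support vector machine (TWSVM) is constructed. Analysis of the vibration signals of a CM-type OLTC under normal conditions and typical mechanical faults shows that the proposed optimized TQWT decomposition method effectively improves the accuracy of the OLTC vibration signal decomposition. Compared with other diagnosis models, the constructed AFSA-TWSVM mechanical fault diagnosis model for OLTCs classifies well and converges faster.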