Abstract
Twin support vector machines (TWSVM) have been applied successfully in many fields. However, the standard TWSVM model is not robust when classifying data that carry distribution characteristics. In particular, when the data are highly uncertain, a standard classification model that ignores the distribution of the sample points can no longer meet the required classification accuracy. This paper therefore proposes a weighted linear twin support vector machine based on data distribution characteristics, denoted TWSVM-U. Building on TWSVM, the new model takes into account the influence of the data distribution on the positions of the classification hyperplanes, and constructs distance weights quantitatively from the dispersion of the data along the normal directions of those hyperplanes. TWSVM-U is a generalization of TWSVM: when the training samples have no distribution characteristics, TWSVM-U reduces to the standard TWSVM model. Experiments with 10-fold cross-validation show that TWSVM-U outperforms SVM and TWSVM on classification problems involving uncertain data with a large fluctuation range.
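The abstract does not state the weight formula, so the following is only a minimal sketch of the idea of weighting distances by dispersion along a hyperplane normal. It assumes each uncertain sample comes with a covariance matrix, and it uses a hypothetical weight `1 / (1 + s)`, where `s` is the standard deviation of the sample's distribution projected onto the unit normal. Neither the covariance representation nor this weight is taken from the paper; the weight was chosen only so that samples without dispersion get weight 1, which mirrors the stated reduction to standard TWSVM.

```python
import numpy as np

def dispersion_along_normal(w, covariances):
    """Std. dev. of each sample's distribution projected onto the normal w.

    w: (d,) normal vector of a hyperplane w.x + b = 0
    covariances: (n, d, d) per-sample covariance matrices (an assumption)
    """
    u = w / np.linalg.norm(w)
    # u^T Sigma_i u for every sample i
    var = np.einsum("j,ijk,k->i", u, covariances, u)
    return np.sqrt(np.clip(var, 0.0, None))

def distance_weights(w, covariances):
    """Hypothetical weight 1 / (1 + dispersion); not the paper's formula."""
    return 1.0 / (1.0 + dispersion_along_normal(w, covariances))

def weighted_distances(X, w, b, covariances):
    """Point-to-hyperplane distances scaled by the hypothetical weights."""
    d = np.abs(X @ w + b) / np.linalg.norm(w)
    return distance_weights(w, covariances) * d

# One certain sample (zero covariance) and one uncertain sample.
X = np.array([[3.0, 4.0], [3.0, 4.0]])
w = np.array([3.0, 4.0])
b = 0.0
covs = np.stack([np.zeros((2, 2)), 9.0 * np.eye(2)])
weights = distance_weights(w, covs)
dists = weighted_distances(X, w, b, covs)
```

With zero covariance the weight is 1 and the weighted distance equals the ordinary geometric distance, so a model built on these distances would treat fully certain data exactly as an unweighted one does.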
Authors
SONG Rui-yang (宋瑞阳)
MENG Hua (孟华)
LONG Zhi-guo (龙治国)
SONG Rui-yang; MENG Hua; LONG Zhi-guo (School of Mathematics, Southwest Jiaotong University, Chengdu 611756, China; School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China)
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals (北大核心)
2019, Issue B06, pp. 407-411 (5 pages)
Funding
NSFC (61773324)
Ministry of Education Humanities and Social Sciences Project (18XJC72040001)
Fundamental Research Funds for the Central Universities (2682016CX114, 2682018CX25)
Keywords
Binary classification
Twin support vector machine
Uncertain information
Weighted distance