Abstract
Support vector machine (SVM) classification introduces a penalty factor to control overfitting and to handle the case where linearly inseparable data would otherwise admit no solution. The benefit is that an optimal solution can be obtained by tuning this parameter; the cost is that some samples are allowed to be misclassified. These misclassified samples are no longer constrained within the classification margin, so samples near the boundary between the two classes are distributed in a disorderly way, and the training burden increases. To address these problems, this paper follows the large margin classification idea and the principle that samples should be compact within a class and dispersed between classes, and proposes a new classification algorithm called Intraclass-Distance-Sum-Minimization (IDSM). The algorithm constructs its training model from the criterion of minimizing the sum of intraclass distances and obtains the optimal projection rule analytically. The samples are then projected with this rule so that intraclass distances become small and interclass distances become large. A kernelized version of the algorithm is also proposed for high-dimensional classification problems. Experimental results on a large number of UCI datasets and the Yale face database demonstrate the superiority of the proposed algorithm.
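The abstract does not state the IDSM training model, so the following is only a minimal sketch of the general idea it describes: choose a projection direction analytically so that projected samples are tight within each class. As a stand-in, it minimizes the projected intraclass scatter while holding the projected gap between the two class means fixed, which has the closed form w ∝ S_w^{-1}(m_1 - m_0) and coincides with Fisher's linear discriminant; it is not the paper's algorithm. The names `fit_projection` and `predict`, the `ridge` term, the midpoint threshold, and the restriction to two classes are all assumptions made for illustration.

```python
import numpy as np


def fit_projection(X, y, ridge=1e-6):
    """Closed-form projection for two classes (illustrative stand-in, not IDSM).

    Minimizes the projected intraclass scatter w^T S_w w subject to a fixed
    projected gap between the class means, giving w ~ S_w^{-1} (m1 - m0).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    labels = np.unique(y)
    if labels.size != 2:
        raise ValueError("this sketch handles exactly two classes")

    means = [X[y == label].mean(axis=0) for label in labels]

    # Intraclass scatter: sum over classes of centered outer products.
    n_features = X.shape[1]
    scatter = np.zeros((n_features, n_features))
    for label, mean in zip(labels, means):
        centered = X[y == label] - mean
        scatter += centered.T @ centered

    # Small ridge term (an assumption) keeps the linear system well posed.
    scatter += ridge * np.eye(n_features)

    w = np.linalg.solve(scatter, means[1] - means[0])
    w /= np.linalg.norm(w)

    # Decision threshold: midpoint of the two projected class means.
    threshold = 0.5 * (w @ means[0] + w @ means[1])
    return w, threshold, labels


def predict(X, w, threshold, labels):
    """Project samples onto w and assign the class on each side of the threshold."""
    scores = np.asarray(X, dtype=float) @ w
    return np.where(scores > threshold, labels[1], labels[0])


if __name__ == "__main__":
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                  [5, 5], [5, 6], [6, 5], [6, 6]], dtype=float)
    y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    w, threshold, labels = fit_projection(X, y)
    print("direction:", w)
    print("predictions:", predict(X, w, threshold, labels))
```

Because the scatter matrix is positive definite after the ridge term is added, the second class always projects above the first, so a single threshold suffices. A kernelized variant would replace the inner products with kernel evaluations; that is left out here because the abstract gives no details of the paper's kernel formulation.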
Source
Journal of Electronics & Information Technology (《电子与信息学报》)
EI
CSCD
Peking University Core Journal (北大核心)
2016, No. 3, pp. 532-540 (9 pages)
Funding
National Natural Science Foundation of China (61170122, 61272210)
Keywords
Support Vector Machine (SVM)
Penalty factor
Large margin classification
Sum of intraclass distances
Projection rule