Abstract
Traditional classifiers suffer from three main problems: (1) they cannot fully exploit the distribution characteristics of the training samples; (2) they cannot preserve the relative (rank) relations between samples of different classes; and (3) most of them cannot handle large-scale classification problems. To address these problems, this paper proposes RPCM, a rank-preserving classification method based on maximum scatter difference. RPCM uses the between-class scatter and the within-class scatter from linear discriminant analysis (LDA) to characterize the distribution of the samples, and it preserves the rank relations between classes by keeping the relative relations among the class centers (class means) unchanged. Theoretical analysis shows that the dual form of RPCM is equivalent to the minimum enclosing ball (MEB) problem. Building on the core vector machine (CVM), the RPCM-CVM algorithm is then proposed to solve large-scale classification problems. Comparative experiments on several standard datasets verify the effectiveness of the proposed RPCM and RPCM-CVM methods.
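The two LDA quantities named in the abstract can be illustrated with a short sketch. The abstract does not give the RPCM objective or its dual, so the code below is not the authors' method: it only computes the standard between-class and within-class scatter matrices and then, as an assumption, applies the generic maximum scatter difference criterion, maximizing w^T (S_b - c S_w) w over unit vectors w. The function names, the trade-off parameter `c`, and the toy data are all hypothetical.

```python
import numpy as np


def scatter_matrices(X, y):
    """Return (S_b, S_w): LDA between-class and within-class scatter matrices."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for label in np.unique(y):
        X_c = X[y == label]
        mean_c = X_c.mean(axis=0)
        # Between-class term: class size times outer product of the mean offset.
        diff = (mean_c - overall_mean)[:, None]
        S_b += X_c.shape[0] * (diff @ diff.T)
        # Within-class term: scatter of the samples around their own class mean.
        centered = X_c - mean_c
        S_w += centered.T @ centered
    return S_b, S_w


def msd_direction(X, y, c=1.0):
    """Unit vector w maximizing w^T (S_b - c * S_w) w (generic MSD criterion).

    `c` is a hypothetical trade-off parameter, not a value taken from the paper.
    """
    S_b, S_w = scatter_matrices(X, y)
    # S_b - c * S_w is symmetric, so eigh applies; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(S_b - c * S_w)
    return eigvecs[:, -1]


# Toy data (illustrative only): two classes separated along the first axis.
X = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])
y = np.array([0, 0, 1, 1])

w = msd_direction(X, y, c=1.0)
# Projected class means; their ordering along w is what a rank-preserving
# method would aim to keep fixed. The sign of w itself is arbitrary.
projected_means = [float(X[y == k].mean(axis=0) @ w) for k in (0, 1)]
print(w, projected_means)
```

Because the difference S_b - c S_w is used instead of the ratio in classical LDA, no inverse of S_w is required, so the sketch still works when S_w is singular.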
Source
《西安石油大学学报(自然科学版)》
CAS
PKU Core (Peking University Core Journals list)
2017, No. 4, pp. 123-126 (4 pages)
Journal of Xi’an Shiyou University (Natural Science Edition)
Funding
National Natural Science Foundation of China (Grant No. 61202311)
Natural Science Foundation of Shanxi Province (Grant No. 201601D011042)
Keywords
maximum scatter difference
rank-preserving classification
between-class scatter
within-class scatter