Abstract
In supervised learning, the performance of traditional algorithms such as LDA degrades considerably when the number of samples is strongly imbalanced across classes. Under the iid assumption, each class can be regarded as distinctive, and the classes as mutually independent. Based on this observation, the Isotropic Hyper Sphere Neighborhood (IHSN) algorithm is proposed. In the space R^(n-1), n isotropic regular simplices are constructed to serve as the isotropic hypersphere neighborhoods of the samples in the embedding space, and the mapping between the data space and the embedding space is obtained by constrained least squares regression. Two realizations of IHSN are presented: IHSN-ML, which is based on manifold learning, and IHSN-KL, which is based on the Kullback-Leibler divergence. IHSN-ML has a closed-form solution and is fast to compute; IHSN-KL is more interpretable and more accurate. Experiments on the IRIS and PIE-CMU datasets verify the effectiveness of the proposed methods.
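The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' IHSN-ML or IHSN-KL. It assumes one reading of the construction, in which each of the n classes is assigned one vertex of a regular simplex in R^(n-1) as its target in the embedding space, and it substitutes plain ridge-regularized least squares for the constrained regression mentioned in the abstract. The names `regular_simplex`, `fit_linear_map`, `predict`, `demo` and the parameter `reg` are hypothetical and chosen for illustration only.

```python
import numpy as np


def regular_simplex(n):
    """Return n unit-norm vertices of a regular simplex in R^(n-1), one per row."""
    if n < 2:
        raise ValueError("need at least two classes")
    # Centered one-hot vectors: they span the (n-1)-dim subspace orthogonal to 1.
    centered = np.eye(n) - 1.0 / n
    # The first n-1 right singular vectors form an orthonormal basis of that subspace.
    _, _, vt = np.linalg.svd(centered)
    verts = centered @ vt[: n - 1].T              # shape (n, n-1)
    return verts / np.linalg.norm(verts, axis=1, keepdims=True)


def fit_linear_map(X, y, n_classes, reg=1e-3):
    """Ridge least squares from data space to simplex targets (assumed stand-in)."""
    targets = regular_simplex(n_classes)[y]       # one target vertex per sample
    Xa = np.hstack([X, np.ones((len(X), 1))])     # append a bias column
    A = Xa.T @ Xa + reg * np.eye(Xa.shape[1])
    return np.linalg.solve(A, Xa.T @ targets)     # shape (d + 1, n_classes - 1)


def predict(X, W, n_classes):
    """Assign each sample to the class whose simplex vertex is nearest."""
    Xa = np.hstack([X, np.ones((len(X), 1))])
    Z = Xa @ W
    verts = regular_simplex(n_classes)
    dist2 = ((Z[:, None, :] - verts[None, :, :]) ** 2).sum(axis=-1)
    return dist2.argmin(axis=1)


def demo(seed=0):
    """Synthetic, deliberately imbalanced three-class data (100 / 20 / 5 samples)."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    sizes = [100, 20, 5]
    y = np.repeat(np.arange(3), sizes)
    X = centers[y] + 0.5 * rng.standard_normal((len(y), 2))
    W = fit_linear_map(X, y, n_classes=3)
    return float((predict(X, W, n_classes=3) == y).mean())
```

Because every pair of simplex vertices is equally far apart, no class is favored by the geometry of the targets, which is the property the abstract relies on for imbalanced data. Calling `demo()` returns the training accuracy on the synthetic data.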
Source
Chinese Journal of Computers (《计算机学报》)
EI
CSCD
PKU Core (北大核心)
2014, No. 11, pp. 2256-2261 (6 pages)
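Funding
Supported by the National Natural Science Foundation of China (61170109, 61272007, 61100119), the Zhejiang Provincial Natural Science Foundation (Y14F030022, LY12F02009, LY13F020015), and a project of the Science and Technology Department of Zhejiang Province (2012C21021).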
Keywords
manifold learning
intrinsic structure
dimensionality reduction
supervised learning