Abstract
The traditional K-nearest-neighbor (KNN) classification algorithm has several drawbacks in practice: it does not remove noisy samples, it does not preserve the manifold structure of the data when the sample space is transformed, and it ignores the correlation between sample attributes. To address these problems, we propose SA-KNN, a KNN algorithm with an adaptive K value. Drawing on sparse learning theory, the method reconstructs each test sample from the training samples. The reconstruction exploits the correlation between samples, uses Locality Preserving Projections (LPP) to keep the data structure unchanged, and applies an l2,1 norm to remove noisy samples. Solving this problem yields a transformation matrix W, which is then used to determine the value of K in KNN. Simulation experiments on UCI data sets show that the method achieves higher classification accuracy than both traditional KNN and Entropy-KNN.
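The idea of reconstructing test samples from training samples and reading K off the resulting matrix W can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the objective function or the rule that maps W to K, so everything below is an assumption. The sketch minimises only a least-squares reconstruction error plus an l2,1 penalty (the LPP term is omitted), solves it by proximal gradient descent, and takes K for each test sample as the number of coefficients in its column of W that exceed a relative threshold. The function names and the parameters `lam`, `n_iter`, and `rel_tol` are hypothetical.

```python
import numpy as np
from collections import Counter


def l21_reconstruction(X_train, X_test, lam=0.1, n_iter=500):
    """Approximately minimise ||A @ W - B||_F^2 + lam * ||W||_{2,1},
    where A = X_train.T and B = X_test.T (rows of X_* are samples).

    Assumption: plain proximal gradient descent; the LPP regulariser
    mentioned in the abstract is NOT included in this sketch.
    """
    A = X_train.T                      # (d, n): one column per training sample
    B = X_test.T                       # (d, m): one column per test sample
    W = np.zeros((A.shape[1], B.shape[1]))
    # Step size 1/L, with L the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ W - B)
        V = W - step * grad
        # Proximal operator of the l2,1 norm: shrink each row towards zero.
        row_norms = np.linalg.norm(V, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(row_norms, 1e-12))
        W = shrink * V
    return W


def adaptive_k(W, rel_tol=0.1):
    """Hypothetical rule: K_j = number of training samples whose coefficient
    for test sample j exceeds rel_tol times the largest one (at least 1)."""
    mags = np.abs(W)
    col_max = mags.max(axis=0, keepdims=True)
    k = (mags > rel_tol * col_max).sum(axis=0)
    return np.maximum(k, 1)


def adaptive_knn_predict(X_train, y_train, X_test, lam=0.1, rel_tol=0.1):
    """Classify each test sample by majority vote among its K_j nearest
    training samples (Euclidean distance), with K_j derived from W."""
    W = l21_reconstruction(X_train, X_test, lam=lam)
    ks = adaptive_k(W, rel_tol)
    preds = []
    for x, k in zip(X_test, ks):
        dists = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(dists)[:k]
        votes = Counter(y_train[nearest].tolist())
        preds.append(votes.most_common(1)[0][0])
    return np.array(preds), ks
```

Calling `adaptive_knn_predict(X_train, y_train, X_test)` with NumPy arrays returns the predicted labels together with the per-sample K values. Because the l2,1 penalty zeroes whole rows of W, a training sample that is unhelpful for every test sample drops out of the reconstruction entirely, which is one way to read the abstract's "removal of noise samples".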
Source
《计算机工程与科学》
CSCD
Peking University Core Journals (北大核心)
2015, Issue 10, pp. 1965-1970 (6 pages)
Computer Engineering & Science
Funding
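National Natural Science Foundation of China (61170131, 61263035)
National High Technology Research and Development Program of China (863 Program) (2012AA011005)
National Basic Research Program of China (973 Program) (2013CB329404)
Guangxi Natural Science Foundation (2012GXNSFGA060004)
Guangxi Bagui Innovation Team and Guangxi Hundred Talents Program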
Innovation Project of Guangxi Graduate Education (YCSZ2015095, YCSZ2015096)
Keywords
K nearest neighbor (KNN) classification
correlation
removal of noise samples
locality preserving projection
sparse learning