Abstract
The traditional K-nearest neighbor (K-NN) classification algorithm handles large-scale dataset classification inefficiently. To address this, this paper presents an accelerated K-NN classification method based on parallel computing, called PK-NN (parallel K-NN classification). The method first partitions the samples to be classified into several independent, identically distributed working subsets, then executes the traditional K-NN classification method on every working subset in parallel. Since each working subset is far smaller than the whole dataset, the complexity of the classification algorithm is reduced, so large-scale classification problems can be handled effectively. Simulation results demonstrate that the PK-NN algorithm achieves high classification efficiency.
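The abstract itself contains no code, but the partitioning scheme it describes can be sketched as follows. This is a minimal illustration, assuming the working subsets are formed from the samples to be classified and that each worker runs a brute-force K-NN search against the full training set; the names `parallel_knn`, `knn_classify`, and `classify_subset` are illustrative, not from the paper, and a thread pool is used here for simplicity (a process pool would be needed for true CPU parallelism in CPython).

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def knn_classify(train, point, k):
    """Brute-force K-NN: label a point by majority vote of its k nearest training samples."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    labels = [label for _, label in nearest]
    return Counter(labels).most_common(1)[0][0]


def classify_subset(train, subset, k):
    """Run plain K-NN over one working subset of query points."""
    return [knn_classify(train, x, k) for x in subset]


def parallel_knn(train, test, k=3, n_workers=2):
    """Partition the test samples into working subsets and classify them in parallel."""
    # Round-robin split: each worker gets a smaller, similarly distributed subset.
    subsets = [test[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda s: classify_subset(train, s, k), subsets))
    # Re-interleave per-subset predictions back into the original test order.
    preds = [None] * len(test)
    for w, subset_preds in enumerate(results):
        for j, label in enumerate(subset_preds):
            preds[w + j * n_workers] = label
    return preds


# Toy example: two well-separated clusters labeled 'a' and 'b'.
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
test = [(0.2, 0.2), (5.2, 5.2), (0, 0.5), (6, 6)]
preds = parallel_knn(train, test, k=3, n_workers=2)
```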
Source
《太原师范学院学报(自然科学版)》
2014, No. 4, pp. 44-46, 79 (4 pages in total)
Journal of Taiyuan Normal University:Natural Science Edition
Keywords
K-nearest neighbor classification
parallel computing
parallel K-NN classification
working subset