Abstract
To address the low prediction efficiency of traditional K-nearest neighbor (K-NN) classification, this paper proposes a speeding K-NN classification method based on testing sample labels (Speeding K-NN Classification Based on Testing Sample Label, KNN_TSL). The method first uses traditional K-NN classification to label a certain number of test samples. For each test sample that arrives afterwards, it computes the distance to the already labeled test samples: if that distance is below a given threshold, the new sample receives the same class label; otherwise it is reclassified. Easily classified samples that arrive later can thus be decided by computing distances to only a small number of labeled test samples, which are more representative than the original labeled samples, and only a few test samples need to be reclassified. Because the labeled test samples already carry part of the class information, the method can greatly improve prediction efficiency while preserving the generalization performance of the model. Experimental results show that the proposed KNN_TSL method achieves both high prediction speed and good prediction accuracy.
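The procedure described in the abstract can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the function names, the Euclidean distance, the parameters `n_seed` (how many test samples are first labeled by plain K-NN) and `threshold`, and the choice to cache only samples that were labeled by a full K-NN pass are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Plain K-NN: majority vote among the k nearest training samples."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def knn_tsl_predict(train_X, train_y, test_X, k=3, n_seed=2, threshold=0.5):
    """Hypothetical sketch of the KNN_TSL idea from the abstract.

    The first n_seed test samples are always labeled by plain K-NN.
    A later sample copies the label of its nearest already-labeled test
    sample when their distance is below `threshold`; otherwise it is
    classified by plain K-NN again.
    """
    cache_X, cache_y = [], []   # test samples labeled by a full K-NN pass
    preds, shortcuts = [], 0
    for idx, x in enumerate(test_X):
        label = None
        if idx >= n_seed and cache_X:
            # Nearest labeled test sample (assumed Euclidean distance).
            j = min(range(len(cache_X)),
                    key=lambda i: math.dist(cache_X[i], x))
            if math.dist(cache_X[j], x) < threshold:
                label = cache_y[j]
                shortcuts += 1
        if label is None:
            # Seed phase or too far from every cached sample: full K-NN.
            label = knn_predict(train_X, train_y, x, k)
            cache_X.append(x)   # assumption: only K-NN-labeled samples are cached
            cache_y.append(label)
        preds.append(label)
    return preds, shortcuts


if __name__ == "__main__":
    train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    train_y = ["a", "a", "a", "b", "b", "b"]
    test_X = [(0.2, 0.2), (5.2, 5.2), (0.3, 0.3), (5.1, 5.3), (4, 4)]
    print(knn_tsl_predict(train_X, train_y, test_X))
```

In this toy run the third and fourth test samples lie close to earlier labeled test samples and skip the full K-NN search, while the last one falls outside the threshold and is reclassified. The speed-up in practice would depend on how small the cache of labeled test samples stays relative to the training set and on how the threshold is chosen, neither of which the abstract specifies.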
Source
《计算机与现代化》
2017, No. 9, pp. 102-105 (4 pages)
Computer and Modernization