Abstract
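Because the traditional KNN algorithm requires a different value of k to be chosen for each data set, two algorithms with adaptive neighbor values are proposed. Built on the traditional KNN algorithm, they classify the data several times using multiple values of k, collect statistics over these classifications, and decide each sample's class from those statistics. Method 1 counts the number of nearest neighbors that fall in each class across the repeated classifications and assigns the sample to the class with the most neighbors. Method 2 counts how many times each class is chosen as the result across the repeated classifications and assigns the sample to the class chosen most often. Both methods avoid having to set k each time. The experimental results show that the proposed algorithms yield more stable and more representative results.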
Because the traditional k-nearest neighbor algorithm gives no way to determine the value of k, an improved k-nearest neighbor algorithm is proposed. The algorithm is based on the traditional k-nearest neighbor algorithm but handles the results differently. Each test sample is classified with several different values of k, the results are tallied per class, and the test sample is assigned to the class with the largest count. Compared with the traditional k-nearest neighbor algorithm, the improved algorithm avoids having to choose the best value of k, a choice that strongly influences the classification results. The innovation of the algorithm is that it adapts to the classification results and does not need to consider the true value of k. According to the experimental results, the improved algorithm is much more stable than the traditional algorithm, and its classification results are closer to the true values.
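The two counting schemes described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of Euclidean distance, the caller-supplied list `k_values`, and the tie-breaking rule (the class encountered first wins) are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def _labels_by_distance(train_X, train_y, x):
    """Return training labels ordered from nearest to farthest from x.

    Euclidean distance is an assumption; the abstract names no metric.
    """
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return [train_y[i] for i in order]


def classify_by_neighbor_count(train_X, train_y, x, k_values):
    """Method 1 (sketch): sum the per-class neighbor counts over all k values
    and return the class with the largest total."""
    labels = _labels_by_distance(train_X, train_y, x)
    totals = Counter()
    for k in k_values:
        totals.update(labels[:k])  # add the k nearest labels for this run
    return totals.most_common(1)[0][0]  # ties: first-seen class (assumption)


def classify_by_vote_count(train_X, train_y, x, k_values):
    """Method 2 (sketch): run an ordinary majority-vote KNN for each k and
    return the class that wins the most runs."""
    labels = _labels_by_distance(train_X, train_y, x)
    wins = Counter()
    for k in k_values:
        winner = Counter(labels[:k]).most_common(1)[0][0]
        wins[winner] += 1
    return wins.most_common(1)[0][0]  # ties: first-seen class (assumption)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    y = ["a", "a", "a", "b", "b", "b"]
    print(classify_by_neighbor_count(X, y, (0.5, 0.5), [1, 3, 5]))
    print(classify_by_vote_count(X, y, (5.5, 5.5), [1, 3, 5]))
```

In this sketch the training points are sorted once per query, so trying several values of k costs little more than a single KNN query; which range of k values to use remains a choice the abstract leaves open.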
Source
Electronic Science and Technology (《电子科技》)
2017, No. 7, pp. 29-32 (4 pages)
Keywords
KNN algorithm (k-nearest neighbors)
classification
adaptive