Abstract
The definition of the distance between samples directly affects the accuracy and efficiency of the KNN algorithm. To address the shortcomings of the traditional KNN algorithm in how distance is defined and how the class is decided, this paper proposes an improved KNN algorithm (FCD-KNN) that exploits the importance of attribute values to the class. First, the distance between two samples is defined as the correlation distance of their attribute values, which effectively measures the similarity between samples. Next, the K nearest neighbors of the test sample are selected according to this distance. Finally, the class of the test sample is decided from the average distance and the number of neighbors in each class. Theoretical analysis and simulation experiments show that FCD-KNN achieves higher classification accuracy than both traditional KNN and distance-weighted KNN.
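The three steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact correlation-distance formula or the exact decision rule, so both are assumptions here: the per-attribute distance is a stand-in in the style of the Value Difference Metric (half the absolute difference between class-conditional probabilities of two attribute values), and the class score is assumed to be the neighbor count divided by the mean neighbor distance. All function names (`fit_value_class_probs`, `value_distance`, `sample_distance`, `classify`) are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def fit_value_class_probs(X, y):
    """Estimate P(class | attribute j takes value v) from categorical data."""
    classes = sorted(set(y))
    counts = defaultdict(Counter)  # (attribute index, value) -> class counts
    for row, label in zip(X, y):
        for j, v in enumerate(row):
            counts[(j, v)][label] += 1
    probs = {}
    for key, ctr in counts.items():
        total = sum(ctr.values())
        probs[key] = {c: ctr[c] / total for c in classes}
    return classes, probs


def value_distance(probs, classes, j, a, b):
    """ASSUMED stand-in (VDM-style) for the paper's per-attribute distance."""
    if a == b:
        return 0.0
    pa, pb = probs.get((j, a)), probs.get((j, b))
    if pa is None or pb is None:
        return 1.0  # unseen value: assume the maximal per-attribute distance
    return sum(abs(pa[c] - pb[c]) for c in classes) / 2.0


def sample_distance(probs, classes, u, v):
    """Step 1: sample distance as the sum of per-attribute value distances."""
    return sum(
        value_distance(probs, classes, j, a, b)
        for j, (a, b) in enumerate(zip(u, v))
    )


def classify(X, y, query, k=3, eps=1e-9):
    classes, probs = fit_value_class_probs(X, y)
    # Step 2: keep the k training samples closest to the query.
    neighbors = sorted(
        ((sample_distance(probs, classes, row, query), label)
         for row, label in zip(X, y)),
        key=lambda t: t[0],
    )[:k]
    per_class = defaultdict(list)
    for d, label in neighbors:
        per_class[label].append(d)

    # Step 3 (ASSUMED rule): favor classes with more neighbors and a
    # smaller mean distance; eps avoids division by zero.
    def score(c):
        dists = per_class[c]
        return len(dists) / (sum(dists) / len(dists) + eps)

    return max(per_class, key=score)
```

Combining count and mean distance in one score lets a class with fewer but much closer neighbors outvote a larger, more distant group, which plain majority voting cannot do. The actual weighting used by FCD-KNN may differ from this ratio.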
Source
《计算机科学》
CSCD
PKU Core Journal (北大核心)
2013, Issue 11A, pp. 157-159, 187 (4 pages)
Computer Science
Funding
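Supported by the Scientific Research Fund of the Guangxi Education Department (201106LX577, 201106LX604), the National Natural Science Foundation of China (40971234), and the Hechi University Youth Research Project (2012B-N005, 2012B-N007).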
Keywords
KNN algorithm
Correlation distance
Attribute value
Sample distance mechanism