Abstract
Fuzzy k-nearest neighbor (FkNN) classification and its improved variants do not account for unevenly distributed samples or noisy samples, and they do not fully reflect how sample features differ from class to class, which lowers classification accuracy. To address these limitations, this paper proposes a fuzzy weighted kNN data classification method based on affinity. First, the membership of each sample is computed from the affinity among samples. Next, feature weights are computed separately for each class from the fuzzy entropy of the features, and the nearest training samples are selected using a weighted Euclidean distance. Finally, the sample to be classified is assigned a class according to its membership in each class. Experiments on multiple UCI datasets show that the proposed method is effective.
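The last two steps of the abstract (neighbor selection by weighted Euclidean distance, then a decision from class memberships) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the affinity or fuzzy-entropy formulas, so the training memberships `train_u` and the per-class feature weights `class_weights` are taken as precomputed inputs, and the vote uses the classic inverse-distance rule of standard FkNN as a stand-in. All function and parameter names are hypothetical.

```python
import math


def weighted_distance(x, y, w):
    """Weighted Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(wi * (a - b) ** 2 for a, b, wi in zip(x, y, w)))


def fuzzy_weighted_knn(query, train_x, train_u, class_weights, k=3, m=2.0, eps=1e-12):
    """Classify one query sample; returns (predicted_class, memberships).

    train_u[i][c]    : membership of training sample i in class c
                       (assumed precomputed, e.g. from an affinity measure)
    class_weights[c] : feature weights of class c
                       (assumed precomputed, e.g. from fuzzy entropy)
    """
    scored = []
    for x, u in zip(train_x, train_u):
        # Assumption: measure the distance with the weights of the class
        # in which this training sample has its highest membership.
        home = max(range(len(u)), key=u.__getitem__)
        scored.append((weighted_distance(query, x, class_weights[home]), u))

    # Keep the k nearest training samples under the weighted distance.
    scored.sort(key=lambda pair: pair[0])
    neighbors = scored[:k]

    # Standard FkNN vote: closer neighbors contribute more (fuzzifier m > 1).
    inv = [1.0 / (d ** (2.0 / (m - 1.0)) + eps) for d, _ in neighbors]
    total = sum(inv)
    n_classes = len(class_weights)
    memberships = [
        sum(v * u[c] for v, (_, u) in zip(inv, neighbors)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=memberships.__getitem__)
    return predicted, memberships
```

Passing the memberships and weights in as arguments keeps the sketch independent of how the paper actually derives them; any affinity-based membership or entropy-based weighting scheme could be plugged in without changing the voting code.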
Authors
LIU Cheng-cheng (刘诚诚)
JIANG Ying (姜瑛)
Yunnan Key Lab of Computer Technology Application, Kunming 650500, China; Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Source
Journal of Applied Sciences (《应用科学学报》)
CAS
CSCD
Peking University Core Journal (北大核心)
2018, No. 4, pp. 679-688 (10 pages)
Funding
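Supported by the National Natural Science Foundation of China (No. 61462049, No. 61063006, No. 60703116) and the Key Project of the Applied Basic Research Program of Yunnan Province (No. 2017FA033)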
Keywords
data classification
weighted kNN
affinity
fuzzy membership
fuzzy entropy