Abstract
To overcome the limited classification efficiency of ML-KNN, this paper proposes FKMC, a fast KNN-based multi-label classification algorithm that ranks and classifies each unclassified instance using the local information of its k nearest neighbors. First, the k nearest neighbors of the unclassified instance are selected from the set of classified instances. Then the instance is ranked and classified according to the number of labels held by each nearest neighbor and the number of nearest neighbors that carry each label. Simulation results show that the method of selecting the nearest neighbors has a significant effect on classifier performance. In classification quality, FKMC is comparable to ML-KNN in most cases and sometimes better; in classification efficiency, FKMC significantly outperforms ML-KNN.
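The two steps described in the abstract can be illustrated with a minimal sketch. This is not the published FKMC scoring rule, which the abstract does not give. It assumes Euclidean distance for neighbor selection and a hypothetical weighting in which each neighbor splits one vote evenly across its labels, so that a label's score depends both on how many neighbors carry it and on how many labels those neighbors hold. The function name `rank_labels` and the data layout are illustrative assumptions.

```python
import math
from collections import defaultdict


def rank_labels(train, query, k=3):
    """Rank candidate labels for `query` from its k nearest labelled neighbors.

    train: list of (feature_vector, set_of_labels) pairs (assumed layout).
    Returns a list of (label, score) pairs, highest score first.
    """
    # Step 1: select the k nearest neighbors. Euclidean distance is an
    # assumption; the abstract only says the selection method matters.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]

    # Step 2 (hypothetical weighting, not the paper's formula): each neighbor
    # contributes 1 / (its number of labels) to every label it carries.
    scores = defaultdict(float)
    for _, labels in neighbors:
        for label in labels:
            scores[label] += 1.0 / len(labels)

    # Sort by descending score, breaking ties alphabetically by label.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


train = [
    ([0, 0], {"a"}),
    ([0, 1], {"a", "b"}),
    ([1, 0], {"b", "c"}),
    ([5, 5], {"c"}),
]
ranking = rank_labels(train, [0, 0], k=3)
```

A sketch like this needs no training phase beyond storing the labelled instances, which is consistent with the efficiency motivation stated in the abstract. A cut-off on the ranked scores would still be needed to turn the ranking into a final label set.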
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal
2011, No. 32, pp. 138-140, 190 (4 pages)
Funding
Humanities and Social Sciences Fund of the Ministry of Education (No. 09YJAZH072)
Keywords
nearest neighbors
fast classification
multi-label data
Fast K-nearest neighbors based Multi-label Categorization (FKMC)