Abstract
In data mining and machine learning, the curse of dimensionality degrades the performance of learning tasks and lengthens their training time, and its main cause is the presence of redundant features in high-dimensional data. Feature selection removes redundant features from the data, which reduces dimensionality, speeds up model training, and improves performance. Building on a study of particle swarm optimization and the k-nearest neighbor algorithm, this paper combines the two into a feature selection algorithm based on k-nearest neighbor and particle swarm optimization, and verifies its effectiveness on five UCI datasets.
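The abstract does not spell out how the two methods are combined, so the following is only a minimal sketch of one common way to pair them, not the paper's actual algorithm. It assumes a binary particle swarm (sigmoid transfer function) in which each particle is a 0/1 feature mask, scored by leave-one-out k-nearest-neighbor accuracy minus a small penalty on the number of selected features. The function names, the penalty weight `alpha`, and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def knn_accuracy(X, y, mask, k=3):
    """Leave-one-out k-NN accuracy using only the features where mask[j] == 1."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample on the selected columns.
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X) if o != i
        )
        votes = {}
        for _, label in dists[:k]:
            votes[label] = votes.get(label, 0) + 1
        if max(votes, key=votes.get) == y[i]:  # majority vote among the k nearest
            correct += 1
    return correct / len(X)


def binary_pso_select(X, y, n_particles=10, n_iter=20,
                      w=0.7, c1=1.5, c2=1.5, alpha=0.01, seed=0):
    """Search for a feature mask with binary PSO; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    d = len(X[0])

    def fitness(mask):
        # Assumed objective: accuracy minus a small penalty per selected feature.
        return knn_accuracy(X, y, mask) - alpha * sum(mask) / d

    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(d):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][j]
                     + c1 * r1 * (pbest[i][j] - pos[i][j])
                     + c2 * r2 * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, v))  # clamp to keep sigmoid responsive
                # The sigmoid of the velocity is the probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


if __name__ == "__main__":
    # Synthetic data: feature 0 separates the classes, features 1-3 are noise.
    data_rng = random.Random(1)
    X, y = [], []
    for label in (0, 1):
        for _ in range(15):
            X.append([label * 4.0 + data_rng.gauss(0, 0.5)]
                     + [data_rng.gauss(0, 3) for _ in range(3)])
            y.append(label)
    mask, fit = binary_pso_select(X, y)
    print("selected mask:", mask, "fitness:", round(fit, 3))
```

Using the classifier's accuracy as the fitness makes this a wrapper-style selector: every candidate subset costs one full k-NN evaluation, so the swarm size and iteration count directly bound the running time.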
Author
钟昌康
ZHONG Chang-kang (College of Computer Science, Sichuan University, Chengdu 610065)
Source
《现代计算机》
2020, No. 9, pp. 21-24, 40 (5 pages in total)
Modern Computer
Keywords
Feature Selection
Particle Swarm
Data Mining
K-Nearest Neighbor Algorithm