Abstract
Feature selection aims to choose a feature subset free of redundant features while keeping the classification performance of the data unchanged. Rough hypercuboid approaches evaluate feature subsets comprehensively from three aspects, namely the relevance, dependency and significance of features, and have been applied successfully to feature selection. However, evaluating all combinations of feature subsets is NP-hard, and traditional forward search strategies yield only locally optimal results. Therefore, a new algorithm is designed that combines binary particle swarm optimization with the rough hypercuboid approach. The algorithm first uses feature relevance to generate an initial set of particles, then takes an improved version of the rough hypercuboid objective function as the optimization function, and finally searches for the optimal feature subset through iterative binary particle swarm optimization. Experimental results show that, compared with traditional rough hypercuboid methods and rough set methods based on particle swarm optimization, the proposed algorithm obtains feature subsets with fewer features and higher classification performance.
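The three steps named in the abstract (relevance-guided particle generation, an objective function over feature subsets, iterative binary particle swarm updates) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the record does not give the paper's improved rough hypercuboid objective, so `toy_fitness` is a hypothetical stand-in that rewards relevant features and penalizes subset size, and the initialization rule, the parameter values (`w`, `c1`, `c2`, velocity clamp) and the function names are not taken from the paper.

```python
import math
import random


def binary_pso(fitness, relevance, n_particles=20, n_iter=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic binary PSO over 0/1 feature masks (illustrative sketch only)."""
    rng = random.Random(seed)
    n = len(relevance)
    max_rel = max(relevance) or 1.0
    # Assumed initialization: include feature j with probability
    # proportional to its relevance score.
    pos = [[1 if rng.random() < relevance[j] / max_rel else 0
            for j in range(n)] for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(n):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                v = max(-4.0, min(4.0, v))  # clamp velocity
                vel[i][j] = v
                # A sigmoid of the velocity gives the probability of bit = 1.
                pos[i][j] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Hypothetical relevance scores for six features (made-up numbers).
relevance = [0.9, 0.8, 0.05, 0.02, 0.7, 0.01]


def toy_fitness(mask):
    # Stand-in objective, NOT the rough hypercuboid objective:
    # total relevance of the selected features minus a size penalty.
    return sum(m * r for m, r in zip(mask, relevance)) - 0.1 * sum(mask)


best, best_fit = binary_pso(toy_fitness, relevance)
print(best, best_fit)
```

In this sketch the objective is passed in as a plain function, so a real subset evaluator (for example one built on relevance, dependency and significance) could be substituted for `toy_fitness` without changing the search loop.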
Authors
王思朝
罗川
李天瑞
陈红梅
WANG Sizhao; LUO Chuan; LI Tianrui; CHEN Hongmei (College of Computer Science, Sichuan University, Chengdu 610065, China; School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China)
Source
《数据采集与处理》
CSCD
Peking University Core Journals (北大核心)
2022, No. 3, pp. 668-679 (12 pages)
Journal of Data Acquisition and Processing
Funding
National Natural Science Foundation of China (62076171, 61573292, 61976182).
Keywords
rough set
feature selection
combinatorial optimization
rough hypercuboid
binary particle swarm