
Research on Feature Selection Algorithm Based on Natural Evolution Strategy (cited by: 7)
Abstract Feature selection is an NP-hard problem. It aims to remove irrelevant and redundant features from a dataset in order to reduce model training time and improve model accuracy, which makes it an important data preprocessing step in machine learning, data mining, and pattern recognition. This study proposes MCC-NES, a new feature selection algorithm based on natural evolution strategies. First, the algorithm adopts a natural evolution strategy that models the search distribution with a diagonal covariance matrix and adapts its parameters using gradient information. Second, so that the algorithm can handle feature selection effectively, a feature encoding scheme is introduced in the initialization phase. Third, a fitness function is defined that combines classification accuracy with dimensionality reduction. In addition, to cope with high-dimensional data, the idea of cooperative co-evolution is introduced: the original problem is decomposed into relatively small subproblems, each subproblem is solved independently, and the subproblem solutions are then linked together to optimize the solution to the original problem. Furthermore, the concept of distributed population evolution is introduced, in which multiple populations evolve in competition to strengthen the algorithm's exploration ability, and a population restart strategy is designed to keep populations from becoming trapped in local optima. Finally, the proposed algorithm is compared with several traditional feature selection algorithms on a number of UCI public datasets. The experimental results show that the proposed algorithm handles feature selection effectively and is competitive with classical feature selection algorithms, performing especially well on high-dimensional data.
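The abstract names the main ingredients (a diagonal-covariance natural evolution strategy, a feature encoding, and a fitness function mixing accuracy with dimensionality reduction) but gives no formulas. The Python sketch below is only an illustration of how such pieces could fit together, not the authors' MCC-NES. Everything specific in it is an assumption: the threshold encoding in `decode`, the weight `alpha`, the nearest-centroid training accuracy used as a stand-in for classification accuracy, the rank-based utilities, and the learning rates. Cooperative co-evolution, competing populations, and the restart strategy are left out.

```python
import numpy as np

def decode(z, threshold=0.5):
    """Assumed encoding: a feature is selected when its real value exceeds the threshold."""
    return z > threshold

def nearest_centroid_accuracy(X, y):
    """Training accuracy of a nearest-centroid classifier (a cheap stand-in classifier)."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y).mean())

def fitness(mask, X, y, alpha=0.9):
    """Assumed fitness: weighted sum of accuracy and the fraction of features removed."""
    if not mask.any():
        return 0.0  # an empty feature subset cannot classify anything
    accuracy = nearest_centroid_accuracy(X[:, mask], y)
    reduction = 1.0 - mask.sum() / mask.size
    return alpha * accuracy + (1.0 - alpha) * reduction

def snes_feature_selection(X, y, iterations=50, pop_size=20, seed=0):
    """Separable (diagonal-covariance) NES over real vectors decoded into feature masks."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    mu = np.full(d, 0.5)     # mean of the search distribution
    sigma = np.full(d, 0.3)  # per-dimension standard deviation (diagonal covariance)
    eta_mu = 1.0
    eta_sigma = (3 + np.log(d)) / (5 * np.sqrt(d))  # illustrative learning rate

    # Rank-based utilities: the best sample gets the largest weight; weights sum to zero.
    ranks = np.arange(1, pop_size + 1)
    util = np.maximum(0.0, np.log(pop_size / 2 + 1) - np.log(ranks))
    util = util / util.sum() - 1.0 / pop_size

    # Start from the full feature set so the result is never worse than using all features.
    best_mask = np.ones(d, dtype=bool)
    best_fit = fitness(best_mask, X, y)

    for _ in range(iterations):
        s = rng.standard_normal((pop_size, d))  # standard normal perturbations
        z = mu + sigma * s                      # candidate solutions
        fits = np.array([fitness(decode(zk), X, y) for zk in z])
        order = np.argsort(-fits)               # indices from best to worst
        if fits[order[0]] > best_fit:
            best_fit = float(fits[order[0]])
            best_mask = decode(z[order[0]])
        s_sorted = s[order]
        # Natural-gradient estimates for the mean and the log standard deviations.
        grad_mu = util @ s_sorted
        grad_sigma = util @ (s_sorted ** 2 - 1.0)
        mu = mu + eta_mu * sigma * grad_mu
        sigma = sigma * np.exp(0.5 * eta_sigma * grad_sigma)
    return best_mask, best_fit
```

Calling `snes_feature_selection(X, y)` on a NumPy feature matrix `X` and label vector `y` returns a boolean mask over the columns of `X` together with its fitness value. A realistic evaluation would replace the training accuracy with cross-validated accuracy from a proper classifier.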
Authors ZHANG Xin; LI Zhan-Shan (College of Computer Science and Technology, Jilin University, Changchun 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012, China)
Source Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2020, No. 12, pp. 3733-3752 (20 pages)
Funding National Natural Science Foundation of China (61672261); Natural Science Foundation of Jilin Province (20180101043JC); Industrial Technology Research and Development Project of the Jilin Province Development and Reform Commission (2019C053-9)
Keywords evolution strategy; feature selection; cooperative co-evolution; competitive evolution; high-dimensional