Abstract
Attribute selection is an important data preprocessing method in machine learning and pattern recognition. It matters most for high-dimensional data sets, whose high computational complexity strongly affects the performance of data mining algorithms. This paper therefore applies a binary encoding to the glowworms of the continuous glowworm swarm optimization (GSO) algorithm and, combining it with a modified sigmoid function, proposes an attribute selection method based on a binary glowworm swarm optimization algorithm. The method uses the fractal dimension of the data set as the evaluation criterion for attribute subsets and the binary glowworm swarm optimization algorithm as the search strategy. A series of experiments on standard UCI data sets demonstrates the effectiveness and feasibility of the method.
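The two ingredients named in the abstract can be illustrated with a minimal sketch. It rests on assumptions: the abstract does not give the form of the modified sigmoid, so the standard logistic function stands in for it, and the fractal dimension is estimated here with a common box-counting estimate of the correlation dimension, which may differ from the estimator used in the paper. The names `binarize` and `correlation_dimension` are hypothetical, and attribute values are assumed to be scaled to [0, 1].

```python
import math
import random


def sigmoid(x):
    # Standard logistic function; a stand-in for the paper's modified sigmoid.
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position, rng):
    """Map a continuous glowworm position to a 0/1 attribute mask.

    Bit j is set with probability sigmoid(position[j]).
    """
    return [1 if rng.random() < sigmoid(x) else 0 for x in position]


def correlation_dimension(rows, mask, levels=(1, 2, 3, 4)):
    """Box-counting estimate of the correlation fractal dimension.

    Only the attributes selected by `mask` are used. For each grid with
    cell side r = 1 / 2**k, the sum of squared cell occupancies S(r) is
    computed; the estimate is the least-squares slope of log S(r)
    against log r.
    """
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    xs, ys = [], []
    for k in levels:
        cells = 2 ** k
        counts = {}
        for row in rows:
            # Clamp so that a value of exactly 1.0 falls in the last cell.
            key = tuple(min(int(row[j] * cells), cells - 1) for j in cols)
            counts[key] = counts.get(key, 0) + 1
        xs.append(math.log(1.0 / cells))
        ys.append(math.log(sum(c * c for c in counts.values())))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Toy data: a 32 x 32 grid in the first two attributes; the third
# attribute is an exact copy of the first, so it is redundant.
rows = [(i / 32, j / 32, i / 32) for i in range(32) for j in range(32)]
full = correlation_dimension(rows, [1, 1, 1])
reduced = correlation_dimension(rows, [1, 1, 0])
mask = binarize([50.0, -50.0, 0.0], random.Random(0))
```

In this toy data set, dropping the redundant third attribute leaves the estimated dimension unchanged, which is the kind of signal a fractal-dimension criterion can use to score an attribute subset.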
Source
Journal of Systems Science and Mathematical Sciences (《系统科学与数学》)
CSCD
Peking University Core Journals (北大核心)
2017, No. 2, pp. 407-424 (18 pages)
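Funding
National Natural Science Foundation of China (71271071, 71490725, 91546108)
National 863 Program of China (2015AA042101)
Key Natural Science Project of the Anhui Provincial Department of Education (KJ2016A308)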
Keywords
Attribute selection
Fractal dimension
Swarm intelligence optimization
Binary glowworm swarm optimization