
Feature selection using forest optimization algorithm based on duplication analysis (cited by: 1)
Abstract The forest optimization algorithm is an evolutionary algorithm inspired by the way trees in a forest disperse seeds. It searches the feature space effectively and is easy to implement. However, the convergence speed and optimization ability of the forest as a whole leave room for improvement, and the algorithm adapts poorly to high-dimensional datasets. To address these problems, this paper proposes feature selection using forest optimization algorithm based on duplication analysis (DAFSFOA). The algorithm introduces an adaptive initialization strategy based on information gain, a forest duplication analysis mechanism, a forest restart mechanism, a candidate optimal tree generation strategy, and a fitness function that jointly considers the number of selected features and the classification accuracy. Experimental results show that DAFSFOA achieves the highest classification accuracy on most datasets. On the high-dimensional dataset SRBCT, DAFSFOA improves considerably on feature selection using forest optimization algorithm (FSFOA) in both dimensionality reduction rate and classification accuracy. DAFSFOA explores the feature space more thoroughly than FSFOA and adapts to datasets of different dimensionality.
Authors JI Ruohan (冀若含); DONG Hongbin (董红斌) (School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China)
Source CAAI Transactions on Intelligent Systems (《智能系统学报》), CSCD, Peking University Core Journal, 2022, No. 6, pp. 1113-1122 (10 pages)
Funding Natural Science Foundation of Heilongjiang Province (LH2020F023).
Keywords feature selection; evolutionary algorithm; duplication analysis; information entropy; information gain; restart mechanism; forest optimization algorithm; dimensionality reduction
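
The abstract names two components without giving their formulas: a fitness function that weighs classification accuracy against the number of selected features, and an analysis of how duplicated the forest is. The Python sketch below illustrates one plausible reading of each. It is not the authors' definition: the weighted-sum form, the `alpha` weight, the binary-mask encoding of a tree, and the function names `fitness` and `duplication_rate` are all assumptions made for illustration.

```python
from typing import List, Sequence


def dimensionality_reduction(tree: Sequence[int]) -> float:
    """Fraction of features NOT selected by a binary feature mask (1 = selected)."""
    if len(tree) == 0:
        raise ValueError("tree must contain at least one feature slot")
    return 1.0 - sum(tree) / len(tree)


def fitness(accuracy: float, tree: Sequence[int], alpha: float = 0.9) -> float:
    """Hypothetical fitness: weighted sum of accuracy and dimensionality reduction.

    `alpha` is an assumed trade-off weight, not a value taken from the paper.
    """
    return alpha * accuracy + (1.0 - alpha) * dimensionality_reduction(tree)


def duplication_rate(forest: List[Sequence[int]]) -> float:
    """Hypothetical duplication measure: share of trees that repeat an earlier mask."""
    if len(forest) == 0:
        raise ValueError("forest must contain at least one tree")
    unique_masks = {tuple(tree) for tree in forest}
    return 1.0 - len(unique_masks) / len(forest)


if __name__ == "__main__":
    forest = [[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0], [1, 1, 1, 1]]
    print(duplication_rate(forest))
    print(fitness(accuracy=0.8, tree=forest[0], alpha=0.5))
```

Under this reading, a high duplication rate would signal that the forest has collapsed onto a few feature subsets, which is the kind of condition a restart mechanism could respond to; the threshold for triggering a restart is not given in the abstract and is left out here.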