Abstract
Ensemble methods are an active research topic in machine learning, and the random subspace method is one of the principal ensemble algorithms. The feature subsets it generates, however, may contain redundant features or even noise features, which degrades classification accuracy. To address this, this paper proposes a weak random feature subspace generation algorithm based on mutual information (WRSMI), which effectively removes redundant and noise features from the feature subsets. Experimental results on UCI datasets show that WRSMI achieves better classification performance than the random subspace algorithm.
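The abstract does not spell out how WRSMI builds its subsets, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes discrete features, an empirical mutual information estimate, and two hypothetical thresholds (`relevance_min`, `redundancy_max`). The function names and the greedy filtering rule are likewise assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts.
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def weak_random_subspace(columns, labels, size, relevance_min, redundancy_max, rng):
    """Visit features in random order, keeping relevant, non-redundant ones.

    Assumption: the thresholds are hypothetical knobs for this sketch,
    not parameters taken from the paper.
    """
    order = list(range(len(columns)))
    rng.shuffle(order)
    chosen = []
    for j in order:
        if len(chosen) == size:
            break
        # Likely noise: the feature says little about the class label.
        if mutual_information(columns[j], labels) < relevance_min:
            continue
        # Likely redundant: the feature largely repeats one already chosen.
        if any(mutual_information(columns[j], columns[k]) > redundancy_max
               for k in chosen):
            continue
        chosen.append(j)
    return sorted(chosen)


if __name__ == "__main__":
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    columns = [
        [0, 0, 0, 0, 1, 1, 1, 1],  # informative
        [0, 0, 0, 0, 1, 1, 1, 1],  # duplicate of the first feature
        [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
        [0, 0, 0, 1, 1, 1, 1, 0],  # weakly informative
    ]
    print(weak_random_subspace(columns, labels, 2, 0.1, 0.5, random.Random(0)))
```

In a random subspace ensemble, a routine like this would be called once per base classifier, each call producing a different subset, and the classifiers' predictions would then be combined, for example by majority vote.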
Source
Journal of Nanyang Institute of Technology (《南阳理工学院学报》)
2012, No. 2, pp. 24-29 (6 pages)
Keywords
ensemble learning
random subspace
mutual information
classification performance
feature subset