Abstract
This paper proposes RFSEn, a multi-classifier ensemble algorithm based on random feature subspaces. An appropriate subspace size is chosen first; feature subsets of that size are then selected at random and the training set is projected onto each of them, yielding one base classifier per subspace. The base classifiers are combined into an ensemble classifier, which is used to classify text. RFSEn is compared with single classifiers and with the bagging algorithm, which is based on re-sampling, in experiments on standard datasets. The results show that RFSEn outperforms a single classifier and, to some extent, also outperforms bagging.
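The procedure described in the abstract can be sketched as follows. This is only an illustrative sketch of a generic random-subspace ensemble, not the authors' RFSEn implementation: the abstract does not name the base learner or the combination rule, so a nearest-centroid base classifier and majority voting are assumed here, and all function names, parameters, and the toy term-count data are hypothetical.

```python
import random
from collections import Counter


def train_centroid(X, y, features):
    """Fit a nearest-centroid base classifier on the selected feature columns."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_centroid(centroids, features, row):
    """Return the label whose centroid is closest in the projected subspace."""
    def dist(label):
        return sum((row[f] - c) ** 2 for f, c in zip(features, centroids[label]))
    return min(sorted(centroids), key=dist)


def fit_random_subspace(X, y, subspace_size, n_estimators, seed=0):
    """Train one base classifier per randomly drawn feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_estimators):
        # Draw a feature subset of the chosen size, without replacement.
        features = sorted(rng.sample(range(n_features), subspace_size))
        ensemble.append((features, train_centroid(X, y, features)))
    return ensemble


def predict_ensemble(ensemble, row):
    """Combine base predictions by majority vote (assumed combination rule)."""
    votes = Counter(predict_centroid(c, f, row) for f, c in ensemble)
    # Ties are broken deterministically in favour of the smallest label.
    return max(sorted(votes), key=lambda label: votes[label])


# Toy term-count vectors (made up for illustration only).
X = [
    [3, 2, 4, 0, 1, 0], [2, 3, 3, 1, 0, 0], [4, 2, 2, 0, 0, 1],  # "sports"
    [0, 1, 0, 3, 2, 4], [1, 0, 0, 2, 3, 3], [0, 0, 1, 4, 2, 2],  # "finance"
]
y = ["sports"] * 3 + ["finance"] * 3

ensemble = fit_random_subspace(X, y, subspace_size=3, n_estimators=5)
print(predict_ensemble(ensemble, [3, 3, 3, 0, 0, 0]))
print(predict_ensemble(ensemble, [0, 0, 0, 3, 3, 3]))
```

Unlike bagging, which re-samples training examples, every base classifier here sees all training examples but only a random subset of the features. The subspace size and the number of base classifiers are the two parameters a user would have to tune.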
Source
《南京师范大学学报(工程技术版)》
CAS
2008, No. 4, pp. 87-90 (4 pages)
Journal of Nanjing Normal University(Engineering and Technology Edition)
Keywords
random subspace, classifier ensemble, re-sampling