Abstract
Based on a study of the iris dataset, this paper proposes an improved Bagging Trees ensemble algorithm that rapidly selects classifiers according to differences in their classification performance. Comparison with other statistical machine learning methods, such as CART, Bagging Trees, Random Forest, and GASEN, the currently popular selective ensemble algorithm based on a genetic algorithm, shows that the proposed algorithm achieves relatively high accuracy on classification problems and runs considerably more efficiently than GASEN.
Source
Statistical Education (《统计教育》)
2008, No. 6, pp. 24-28 (5 pages)
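Funding
Key Project of the Natural Science Foundation (#10431010)
Major Project of Key Research Bases of the Ministry of Education (#05JJD910001)
Support from the Center for Applied Statistics, Renmin University of China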
Keywords
Decision Trees
Bootstrap
Selective Ensemble