Abstract
This paper proposes a new decision-tree-based ensemble learning method called FL (Forest Learning). Unlike traditional ensemble learning approaches such as bagging and AdaBoost, FL does not rely on sampling or weighted sampling; instead, it learns a forest directly on the full training set as the ensemble. And unlike the conventional scheme of training each base classifier independently and then combining them for prediction, FL trains each base classifier while accounting for its influence on the ensemble. FL first builds the forest's initial decision tree with a conventional algorithm, and then iteratively constructs new decision trees and adds them to the forest. When constructing a new tree, every node split is evaluated by its effect on the ensemble's performance. Experimental results indicate that, compared with traditional ensemble learning methods, FL builds ensembles with better performance on most datasets.
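The core loop described above can be sketched in miniature. The following is an illustrative toy, not the paper's algorithm: "trees" are reduced to depth-1 threshold stumps over a single feature, and each new stump is chosen greedily by how much it improves the majority-vote accuracy of the current ensemble, rather than being trained independently. All function names and the stump representation are assumptions for illustration only.

```python
# Toy sketch of ensemble-aware forest construction (illustrative only).
# Each "tree" is a depth-1 stump; a new stump is selected by its
# contribution to the *ensemble's* accuracy, not its own.

def stump_predict(threshold, x):
    """A depth-1 decision tree over a single numeric feature."""
    return 1 if x >= threshold else 0

def ensemble_predict(forest, x):
    """Unweighted majority vote over all stumps in the forest."""
    votes = sum(stump_predict(t, x) for t in forest)
    return 1 if votes * 2 >= len(forest) else 0

def ensemble_accuracy(forest, data):
    return sum(ensemble_predict(forest, x) == y for x, y in data) / len(data)

def forest_learning(data, n_trees=5):
    """Greedily add the split that most helps the current ensemble."""
    candidates = sorted({x for x, _ in data})  # candidate thresholds
    forest = []
    for _ in range(n_trees):
        # evaluate each candidate split by ensemble accuracy, not stump accuracy
        best = max(candidates,
                   key=lambda t: ensemble_accuracy(forest + [t], data))
        forest.append(best)
    return forest

data = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
forest = forest_learning(data)
print(ensemble_accuracy(forest, data))  # 1.0 on this separable toy set
```

The contrast with bagging is in the `key=` line: the candidate is scored by the accuracy of `forest + [t]` as a whole, which is the "influence on the ensemble" idea from the abstract reduced to its simplest form.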
Source
Computer Science (《计算机科学》)
CSCD-indexed
Peking University Core Journal
2014, Issue 7, pp. 283-289 (7 pages)
Funding
National 863 Program: Large-Scale Extraction and Construction of Chinese Word-Sense Knowledge Features (2012AA011101)
Key Project of the Henan Science and Technology Department: Energy-Efficient Coverage in Sensor Networks Based on an Adaptive Ant Colony Algorithm (12A520035)
Keywords
Forest learning, Margin-based theory, Contribution gain, Feature transformation