
Hierarchical Bagging Ensemble Pruning Based on the Diversity of Base Learners
Abstract The main objective of this paper is to find a rapid pruning method for Bagging that reduces the storage space occupied by the algorithm, speeds up computation, and offers the potential to improve classification accuracy. A new selective-ensemble idea is also proposed that computes the diversity of base learners directly: the base learner in the set with the strongest ability to raise the diversity of the remaining base learners is selected and deleted, and hierarchical pruning is used to accelerate the algorithm. Without degrading performance, the new algorithm can greatly reduce the size of the Bagging ensemble; it also supports parallel computing, and it performs selective ensembling much faster than GASEN (genetic-algorithm-based selective ensemble). An upper bound on the classification error of ensemble learning is also given.
Source Information and Control (《信息与控制》), CSCD, Peking University Core Journals, 2009, No. 4, pp. 449-454 (6 pages)
Keywords selective ensemble; diversity; hierarchical pruning; parallel computation; base learner; component learner
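The abstract describes the selection rule only at a high level, and neither the paper's diversity measure nor its hierarchical schedule is reproduced on this page. Purely as orientation, below is a minimal Python sketch of greedy diversity-based pruning that assumes pairwise disagreement on a validation set as the diversity measure and, at each step, deletes the base learner whose removal most raises the mean pairwise diversity of the remaining set; all names and the flat (non-hierarchical) loop are illustrative choices, not the paper's algorithm.

```python
# Illustrative sketch only: greedy diversity-based ensemble pruning,
# assuming pairwise disagreement as the diversity measure. The paper's
# actual measure and hierarchical schedule are not given in the abstract.
import numpy as np

def disagreement(pred_a, pred_b):
    """Fraction of validation samples on which two base learners disagree."""
    return float(np.mean(pred_a != pred_b))

def mean_pairwise_diversity(preds):
    """Average pairwise disagreement over a list of prediction vectors."""
    n = len(preds)
    if n < 2:
        return 0.0
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(disagreement(preds[i], preds[j]) for i, j in pairs) / len(pairs)

def prune(preds, target_size):
    """Repeatedly delete the learner whose removal most increases the
    diversity of the remaining ensemble; returns indices of the survivors."""
    keep = list(range(len(preds)))
    while len(keep) > target_size:
        # For each candidate, score the ensemble that would remain without it.
        scores = {i: mean_pairwise_diversity([preds[j] for j in keep if j != i])
                  for i in keep}
        keep.remove(max(scores, key=scores.get))
    return keep

# Hypothetical usage: predictions of 7 base learners on 200 validation samples.
rng = np.random.default_rng(0)
preds = [rng.integers(0, 2, size=200) for _ in range(7)]
print(prune(preds, target_size=3))
```

The hierarchical variant mentioned in the abstract would presumably apply such a rule within sub-groups of the ensemble and then merge the survivors, which is also what would make the per-group passes parallelizable; that schedule, like the error bound, is only asserted in the abstract and not reproduced here.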

References (13)

  • 1 Dietterich T G. Experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization[J]. Machine Learning, 2000, 40(2): 139-157.
  • 2 Martinez-Munoz G, Suarez A. Switching class labels to generate classification ensembles[J]. Pattern Recognition, 2005, 38(10): 1483-1494.
  • 3 Martinez-Munoz G, Suarez A. Using boosting to prune bagging ensembles[J]. Pattern Recognition Letters, 2007, 28(1): 156-165.
  • 4 Zhou Z H, Wu J X, Tang W. Ensembling neural networks: Many could be better than all[J]. Artificial Intelligence, 2002, 137(1-2): 239-263.
  • 5 Zhou Z H, Tang W. Selective Ensemble of Decision Trees[R]. Nanjing: Nanjing University, 2003.
  • 6 Giacinto G, Roli F. An approach to the automatic design of multiple classifier systems[J]. Pattern Recognition Letters, 2001, 22(1): 25-33.
  • 7 Li K, Huang H K. A selective approach to neural network ensemble based on clustering technology[J]. Journal of Computer Research and Development (计算机研究与发展), 2005, 42(4): 594-598.
  • 8 Tamon C, Xiang J. On the boosting pruning problem[A]. Proceedings of the 11th European Conference on Machine Learning[C]. Berlin, Germany: Springer, 2000. 404-412.
  • 9 Breiman L. Bagging predictors[J]. Machine Learning, 1996, 24(2): 123-140.
  • 10 Krogh A, Vedelsby J. Neural network ensembles, cross validation, and active learning[A]. Advances in Neural Information Processing Systems 7[M]. Cambridge, MA, USA: MIT Press, 1995. 231-238.
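The abstract states that an upper bound on the classification error of the ensemble is derived, but the bound itself does not appear on this page. For background only: the classical error-ambiguity decomposition of Krogh and Vedelsby (reference 10 above), on which diversity-based analyses of ensembles commonly build, reads, for a weighted regression ensemble $\bar{f}(x) = \sum_i w_i f_i(x)$:

$$
E = \bar{E} - \bar{A}, \qquad
\bar{E} = \sum_i w_i E_i, \qquad
\bar{A} = \sum_i w_i\, \mathbb{E}_x\!\left[\big(f_i(x) - \bar{f}(x)\big)^2\right],
$$

where $E$ is the generalization error of the ensemble, $E_i$ that of base learner $f_i$, and $\bar{A}$ is the ensemble ambiguity. Since $\bar{A} \ge 0$, it follows that $E \le \bar{E}$: raising diversity without raising the individual errors can only lower the ensemble error, which is the intuition behind diversity-driven pruning. This is background from reference 10, not the paper's classification bound.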

