Abstract
The diversity and the average accuracy of the base classifiers are two key factors in the performance of a classifier ensemble. Increasing the diversity among base classifiers lowers their average accuracy, and vice versa, so there is a tradeoff between diversity and average accuracy at which the ensemble performs best. To find this tradeoff, an ensemble pruning method, Improved Binary Glowworm Swarm Optimization combined with Complementarity measure for Ensemble Pruning (IBGSOCEP), is proposed. First, an initial pool of classifiers is constructed by independently training multiple base classifiers using bootstrap sampling. Second, the classifiers in the initial pool are pre-pruned using a complementarity measure. Third, Improved Binary Glowworm Swarm Optimization (IBGSO) is proposed by improving the movement rule and search process of the glowworms and by introducing a re-initialization mechanism and a leaping behavior. Finally, IBGSO is applied to the pre-pruned base classifiers to select the sub-ensemble with the best ensemble performance. Experimental results on 5 UCI datasets show that IBGSOCEP achieves better ensemble performance than the other approaches while using fewer base classifiers, demonstrating its effectiveness and significance.
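The first two stages described above (bootstrap sampling and complementarity-based pre-pruning) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact pre-pruning rule, so the sketch assumes the usual greedy form of the complementarity measure, in which the next classifier added is the one that is correct on the most validation samples the current sub-ensemble misclassifies. The function names, the representation of classifiers as prediction lists, and the toy data are all hypothetical. The IBGSO stage is omitted because its update rules are not stated in the abstract.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Draw n indices with replacement (one bootstrap replicate)."""
    return [rng.randrange(n) for _ in range(n)]


def majority_vote(preds, members, i):
    """Plurality label of the selected members on sample i."""
    votes = Counter(preds[m][i] for m in members)
    return votes.most_common(1)[0][0]


def complementarity_prune(preds, y, k):
    """Greedily keep k classifiers (assumed form of the measure).

    preds[c][i] is the label classifier c predicts for validation
    sample i; y[i] is the true label. Returns the selected indices
    in the order they were added.
    """
    selected = []
    remaining = list(range(len(preds)))
    while remaining and len(selected) < k:
        # Samples the current sub-ensemble gets wrong; an empty
        # sub-ensemble is treated as wrong on every sample.
        wrong = [
            i for i in range(len(y))
            if not selected or majority_vote(preds, selected, i) != y[i]
        ]
        # Pick the classifier that is correct on most of those samples
        # (ties go to the lowest index).
        best = max(
            remaining,
            key=lambda c: sum(preds[c][i] == y[i] for i in wrong),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy validation set: classifier 1 duplicates classifier 0, while
# classifier 2 is weak overall but correct where classifier 0 fails.
y = [0, 1, 1, 0, 1, 0]
preds = [
    [0, 1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 1],
]
print(complementarity_prune(preds, y, 2))  # prints [0, 2]

# One bootstrap replicate of a 10-sample training set.
print(bootstrap_indices(10, random.Random(0)))
```

In the toy run the redundant copy (classifier 1) is skipped in favor of the less accurate but complementary classifier 2, which is the behavior a diversity-aware pre-pruning step is meant to produce.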
Authors
朱旭辉
倪志伟
倪丽萍
金飞飞
程美英
李敬明
ZHU Xuhui; NI Zhiwei; NI Liping; JIN Feifei; CHENG Meiying; LI Jingming (School of Management, Hefei University of Technology, Hefei 230009, China; Key Laboratory of Process Optimization and Intelligent Decision-making, Ministry of Education, Hefei 230009, China; Business School, Huzhou University, Huzhou 313000, China; School of Management Science and Engineering, Anhui University of Finance and Economics, Bengbu 233030, China)
Source
《电子与信息学报》
EI
CSCD
Peking University Core Journal
2018, No. 7, pp. 1643-1651 (9 pages)
Journal of Electronics & Information Technology
Funding
National Natural Science Foundation of China (91546108, 71271071, 71490725, 71301041)
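National Key Research and Development Program of China (2016YFF0202604)
Open Project of the Key Laboratory of Process Optimization and Intelligent Decision-making, Ministry of Education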
Keywords
Glowworm Swarm Optimization (GSO)
Complementarity measure
Ensemble pruning