Abstract
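Block-based ensemble algorithms are sensitive to the block size and cannot respond promptly to complete (abrupt) concept drift. To address these problems, an ensemble classification algorithm is proposed that takes the local characteristics of the data stream into account and can handle multiple types of concept drift. A sliding window serves as the concept drift detector; when concept drift is detected, a new classifier is built and added to the ensemble. The proposed algorithm is compared with classical algorithms in extensive experiments on synthetic and real datasets. The results show that the proposed algorithm has a clear advantage in classification accuracy, consumes less memory, and is better suited to environments with multiple types of concept drift.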
The main drawback of block-based ensembles is the difficulty of tuning the block size to achieve fast reactions to drifts. Motivated by this challenge, an adaptive ensemble for evolving data streams is proposed to deal with different types of drift. The algorithm uses the adaptive window algorithm as a change detector. When a change is detected, the worst classifier of the ensemble is removed and a new one is added. The proposed algorithm is experimentally compared with state-of-the-art algorithms on synthetic and real datasets. Of all the compared algorithms, the proposed algorithm provided higher classification accuracy while consuming less memory. Experimental results show that the proposed algorithm is suitable for scenarios involving different types of drift as well as for static environments.
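The detect-then-replace loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a simplified two-window error test as a stand-in for the adaptive window detector, a toy majority-class base learner, and accuracy-weighted voting. All names (`WindowDriftDetector`, `MajorityLearner`, `DriftAwareEnsemble`, `run_stream`) and all parameter values are hypothetical.

```python
import math
from collections import Counter, deque


class WindowDriftDetector:
    """Simplified two-window change test (a stand-in, not the ADWIN algorithm)."""

    def __init__(self, max_size=200, delta=0.002, min_half=20):
        self.errors = deque(maxlen=max_size)  # 1 = misclassified, 0 = correct
        self.delta = delta
        self.min_half = min_half

    def update(self, error):
        """Record one 0/1 error; return True if recent and older error rates differ."""
        self.errors.append(error)
        n = len(self.errors)
        if n < 2 * self.min_half:
            return False
        items = list(self.errors)
        old, new = items[: n // 2], items[n // 2:]
        m = 1.0 / (1.0 / len(old) + 1.0 / len(new))
        eps = math.sqrt(math.log(4.0 / self.delta) / (2.0 * m))  # Hoeffding-style bound
        if abs(sum(new) / len(new) - sum(old) / len(old)) > eps:
            self.errors.clear()  # forget the outdated window
            return True
        return False


class MajorityLearner:
    """Toy base learner (assumption): predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def learn(self, x, y):
        self.counts[y] += 1

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None


class DriftAwareEnsemble:
    def __init__(self, size=2, make_learner=MajorityLearner):
        self.size = size
        self.make_learner = make_learner
        self.members = [make_learner()]
        self.recent = [deque(maxlen=50)]  # recent 0/1 errors per member
        self.detector = WindowDriftDetector()
        self.drifts = 0

    def predict(self, x):
        votes = Counter()
        for member, errs in zip(self.members, self.recent):
            label = member.predict(x)
            if label is not None:
                # Weight each vote by the member's recent accuracy.
                votes[label] += 1.0 - (sum(errs) / len(errs) if errs else 0.5)
        return votes.most_common(1)[0][0] if votes else None

    def learn(self, x, y):
        """Test-then-train on one example; return the ensemble's 0/1 error."""
        err = int(self.predict(x) != y)
        for member, errs in zip(self.members, self.recent):
            errs.append(int(member.predict(x) != y))
            member.learn(x, y)
        if self.detector.update(err):
            self.drifts += 1
            if len(self.members) >= self.size:
                # Remove the member with the highest recent error rate.
                worst = max(range(len(self.members)),
                            key=lambda i: sum(self.recent[i]) / max(len(self.recent[i]), 1))
                del self.members[worst]
                del self.recent[worst]
            self.members.append(self.make_learner())
            self.recent.append(deque(maxlen=50))
        return err


def run_stream(labels):
    """Feed a label-only stream (features unused by the toy learner)."""
    ensemble = DriftAwareEnsemble()
    errors = [ensemble.learn(None, y) for y in labels]
    return ensemble, errors
```

For example, `run_stream(["a"] * 300 + ["b"] * 300 + ["c"] * 300)` simulates two abrupt drifts. A real setting would replace the toy learner with an incremental classifier and the two-window test with a proper adaptive window detector.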
Source
《北京交通大学学报》
CAS
CSCD
Peking University Core Journals
2016, Issue 5, pp. 9-15 (7 pages)
JOURNAL OF BEIJING JIAOTONG UNIVERSITY
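Funding
National Natural Science Foundation of China (61572005)
Beijing Natural Science Foundation (4142042)
Xinyang Normal University Young Backbone Teachers Funding Program (2016GGJS-08)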
Keywords
data mining
data streams
concept drift
ensemble classifier
sliding windows