Abstract
To address the classification of data streams with concept drift, a method based on hybrid ensemble learning is proposed. The method considers two factors, the characteristics of the data distribution and the rate of concept drift, and builds the causes of concept drift into the construction of the model. Within a hybrid ensemble learning framework, concept drift is detected from the Bayesian classification error rate, and the sliding window is adjusted dynamically so that different types of concept drift are identified automatically. Experimental results show that, on the identification of data streams with different types of concept drift, the algorithm performs well in both noise resistance and drift detection.
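The abstract does not give the detection rule itself, so the following is only a minimal sketch of the general idea: monitor a classifier's error rate over a sliding window and resize that window when drift is suspected. The class name `ErrorRateDriftMonitor`, its parameters, and the three-standard-deviation threshold are assumptions made for illustration. The threshold is borrowed from the well-known DDM detector and is not taken from the paper. The input is a stream of 0/1 error flags, which could come from a Bayesian classifier or any other model.

```python
from collections import deque
import math


class ErrorRateDriftMonitor:
    """Hypothetical helper: flags drift when the windowed error rate rises
    well above the lowest level seen since the last drift (DDM-style rule)."""

    def __init__(self, min_window=30, max_window=200, threshold=3.0):
        self.min_window = min_window    # window size right after a drift
        self.max_window = max_window    # window size while the concept is stable
        self.threshold = threshold      # assumed: 3 standard deviations
        self.window_size = min_window
        self.errors = deque()           # recent 0/1 error flags
        self.p_min = math.inf           # error rate at the best point so far
        self.s_min = math.inf           # its standard deviation

    def update(self, is_error):
        """Feed one prediction outcome; return True if drift is signalled."""
        self.errors.append(int(is_error))
        while len(self.errors) > self.window_size:
            self.errors.popleft()
        n = len(self.errors)
        if n < self.min_window:
            return False                # not enough evidence yet
        p = sum(self.errors) / n
        s = math.sqrt(p * (1.0 - p) / n)
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + self.threshold * self.s_min:
            # Suspected drift: shrink the window and forget the old concept.
            self.window_size = self.min_window
            self.errors.clear()
            self.p_min = self.s_min = math.inf
            return True
        # Stable period: let the window grow towards its maximum.
        self.window_size = min(self.max_window, self.window_size + 1)
        return False


def run_demo():
    """Synthetic stream: 10% errors before sample 300, 50% errors after."""
    monitor = ErrorRateDriftMonitor()
    drift_points = []
    for i in range(600):
        is_error = (i % 10 == 0) if i < 300 else (i % 2 == 0)
        if monitor.update(is_error):
            drift_points.append(i)
    return drift_points


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the window grows while the error rate is stable and collapses to its minimum size after an alarm, so the monitor adapts quickly to the new concept. A stream with no errors at all would drive the threshold to zero, so a practical version would need a guard for that case.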
Source
《计算机工程与设计》
CSCD
Peking University Core Journals
2014, No. 10, pp. 3489-3492, 3553 (5 pages)
Computer Engineering and Design
Funding
National Natural Science Foundation of China (10771092)
Liaoning Provincial Department of Education Fund (L2011186)
Keywords
concept drift
data stream
sliding window
Bayesian classifier
hybrid ensemble learning