
Adaptive ensemble algorithm based on sliding windows model for data streams
(Original title: 数据流滑动窗口方式下的自适应集成分类算法)

Cited by: 4
Abstract: Block-based ensemble algorithms have two drawbacks: the block size affects classification performance and is hard to tune, and the ensemble cannot react promptly to concept drift. Motivated by this, an adaptive ensemble classification algorithm for evolving data streams is proposed that takes the local characteristics of the stream into account and handles multiple types of concept drift. The algorithm uses an adaptive sliding window as a concept drift detector. When drift is detected, the worst classifier in the ensemble is removed and a new classifier is built and added. The proposed algorithm is compared experimentally with state-of-the-art algorithms on synthetic and real datasets. Among the compared algorithms, it achieves higher classification accuracy while consuming less memory. The results indicate that it is suitable for environments with different types of drift as well as for static environments.
Source: Journal of Beijing Jiaotong University (《北京交通大学学报》), indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心); 2016, No. 5, pp. 9-15 (7 pages).
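Funding: National Natural Science Foundation of China (61572005); Beijing Natural Science Foundation (4142042); Young Backbone Teacher Funding Program of Xinyang Normal University (2016GGJS-08).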
Keywords: data mining; data streams; concept drift; ensemble classifier; sliding windows
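
The abstract describes a loop in which a window-based detector watches the stream and, on drift, the ensemble drops its worst member and adds a new one. The following is a minimal sketch of that loop, not the authors' implementation. Everything named here is an assumption made for illustration: `MajorityClassifier` is a toy base learner, `WindowDriftDetector` is a simplified fixed-size stand-in for the adaptive window detector (it compares the error rates of the two halves of the window), and the accuracy-weighted vote, `max_members`, `history`, `size`, and `threshold` are hypothetical choices.

```python
from collections import Counter, deque


class MajorityClassifier:
    """Toy base learner (an assumption): predicts the most frequent label seen."""

    def __init__(self):
        self.counts = Counter()

    def learn(self, x, y):
        self.counts[y] += 1

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None


class WindowDriftDetector:
    """Simplified stand-in detector: compares error rates of the two window halves."""

    def __init__(self, size=40, threshold=0.3):
        self.window = deque(maxlen=size)
        self.threshold = threshold

    def update(self, error):
        self.window.append(error)
        if len(self.window) < self.window.maxlen:
            return False  # not enough evidence yet
        items = list(self.window)
        half = len(items) // 2
        older = sum(items[:half]) / half
        recent = sum(items[half:]) / (len(items) - half)
        if recent - older > self.threshold:
            self.window.clear()  # start fresh on the new concept
            return True
        return False


class AdaptiveEnsemble:
    """On drift: drop the worst member if the ensemble is full, then add a new one."""

    def __init__(self, max_members=3, history=20):
        self.max_members = max_members
        self.history = history
        self.members = [MajorityClassifier()]
        self.errors = [deque(maxlen=history)]  # recent 0/1 errors per member
        self.detector = WindowDriftDetector()

    def _error_rate(self, i):
        errs = self.errors[i]
        return sum(errs) / len(errs) if errs else 0.0

    def predict(self, x):
        # Vote weighted by each member's recent accuracy (hypothetical choice).
        votes = {}
        for i, member in enumerate(self.members):
            label = member.predict(x)
            if label is not None:
                votes[label] = votes.get(label, 0.0) + 1.0 - self._error_rate(i)
        return max(votes, key=votes.get) if votes else None

    def learn(self, x, y):
        # Test-then-train: score the ensemble before members see the label.
        drift = self.detector.update(int(self.predict(x) != y))
        for member, errs in zip(self.members, self.errors):
            errs.append(int(member.predict(x) != y))
            member.learn(x, y)
        if drift:
            if len(self.members) >= self.max_members:
                worst = max(range(len(self.members)), key=self._error_rate)
                del self.members[worst]
                del self.errors[worst]
            self.members.append(MajorityClassifier())
            self.errors.append(deque(maxlen=self.history))
        return drift
```

Feeding the sketch a stream whose label switches abruptly, for example `[0] * 100 + [1] * 100` passed one item at a time to `learn(None, y)`, should trigger the detector shortly after the switch, after which the newly added member outweighs the stale one in the vote. A real adaptive window detector would choose the window split from the data rather than use a fixed midpoint.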
