期刊文献+

复杂数据流在线集成分类算法综述

Survey of online ensemble classification algorithms for complex data streams
下载PDF
导出
摘要 复杂数据流中所存在的概念漂移及不平衡问题降低了分类器的性能。传统的批量学习算法需要考虑内存以及运行时间等因素,在快速到达的海量数据流中性能并不突出,并且其中还包含着大量的漂移及类失衡现象,利用在线集成算法处理复杂数据流问题已经成为数据挖掘领域重要的研究课题。从集成策略的角度对bagging、boosting、stacking集成方法的在线版本进行了介绍与总结,并对比了不同模型之间的性能。首次对复杂数据流的在线集成分类算法进行了详细的总结与分析,从主动检测和被动自适应两个方面对概念漂移数据流检测与分类算法进行了介绍,从数据预处理和代价敏感两个方面介绍不平衡数据流,并分析了代表性算法的时空效率,之后对使用相同数据集的算法性能进行了对比。最后,针对复杂数据流在线集成分类研究领域的挑战提出了下一步研究方向。 The concept drift and imbalance problems in complex data streams reduce the performance of classifiers.Traditional batch learning algorithms need to consider factors such as memory and runtime,but their performance is not outstanding in ra-pidly arriving massive data streams,and they also contain a large number of drift and class imbalance phenomena.Utilizing online ensemble algorithms to handle complex data stream problems has become an important research topic in the field of data mining.Firstly,this paper introduced and summarized the online versions of bagging,boosting,and stacking ensemble methods from the perspective of ensemble strategies,and compared the performance of different models.Secondly,this paper conducted a detailed summary and analysis of online ensemble classification algorithms for complex data streams for the first time.This paper introduced conceptual drift data stream detection and classification algorithms from two aspects:active detection and passive adaptation.This paper introduced unbalanced data streams from two aspects:data preprocessing and cost sensitivity.This paper analyzed the spatiotemporal efficiency of representative algorithms,and then compared the performance of algorithms using the same dataset.Finally,this paper proposed the next research direction in response to the challenges in the field of online ensemble classification of complex data streams.
作者 李春鹏 韩萌 孟凡兴 何菲菲 张瑞华 Li Chunpeng;Han Meng;Meng Fanxing;He Feifei;Zhang Ruihua(School of Computer Science&Engineering,North Minzu University,Yinchuan 750021,China)
出处 《计算机应用研究》 CSCD 北大核心 2024年第3期641-651,共11页 Application Research of Computers
基金 国家自然科学基金资助项目(62062004) 宁夏自然科学基金资助项目(2022AAC03279)。
关键词 在线学习 集成学习 概念漂移 不平衡 online learning ensemble learning concept drift imbalance
  • 相关文献

参考文献6

二级参考文献26

共引文献98

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部