Online ensemble adaptive algorithm for concept drift of streaming data (面向概念漂移数据流的在线集成自适应算法)
Abstract  Concept drift is one of the main characteristics of streaming data, and both detecting when it occurs and adapting the prediction model to it have attracted considerable research attention. Most existing concept drift algorithms target only a single type of drift and require the input data to follow a particular distribution, so they perform poorly when several types of drift must be detected. This paper proposes an online ensemble adaptive algorithm (KSHPR) that refines the Adaptive Random Forests (ARF) and Streaming Random Patch (SRP) algorithms. It detects concept drift by combining a non-parametric test with a sliding window, which reduces the influence of the window mean on the algorithm's performance. On this basis it builds an ensemble of four base learners whose weights are assigned dynamically according to each learner's prediction accuracy, addressing the low accuracy of learning models on streaming data. Experiments show that the proposed algorithm performs well on both real and synthetic datasets and that, compared with other algorithms, it improves stability, classification accuracy, and adaptability to multiple types of concept drift.
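The two mechanisms named in the abstract, a non-parametric test over sliding windows and accuracy-based weighting of base learners, can be illustrated with a minimal Python sketch. It is not the KSHPR implementation: the abstract does not say which non-parametric test is used, so the two-sample Kolmogorov-Smirnov test here is an assumption, and the names `ks_drift` and `AccuracyWeightedVote`, the significance level, and the window size are all hypothetical.

```python
import math
from bisect import bisect_right
from collections import Counter, deque


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    # The supremum of |Fa - Fb| is attained at one of the sample points.
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        d = max(d, abs(fa - fb))
    return d


def ks_drift(reference, recent, alpha=0.01):
    """Flag drift when the KS statistic exceeds the asymptotic critical value.

    Assumption: `reference` and `recent` are two windows of a numeric
    stream statistic (for example one feature, or the per-instance error).
    """
    n, m = len(reference), len(recent)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, recent) > critical


class AccuracyWeightedVote:
    """Weights each base learner by its accuracy over its last `window` predictions."""

    def __init__(self, n_learners, window=100):
        # One bounded history of hits (1) and misses (0) per learner.
        self.hits = [deque(maxlen=window) for _ in range(n_learners)]

    def weights(self):
        # Learners with no history yet start with full weight.
        acc = [sum(h) / len(h) if h else 1.0 for h in self.hits]
        total = sum(acc)
        if total == 0:
            return [1 / len(acc)] * len(acc)
        return [a / total for a in acc]

    def predict(self, votes):
        # Sum the weights behind each predicted label and return the heaviest.
        scores = Counter()
        for w, v in zip(self.weights(), votes):
            scores[v] += w
        return max(scores, key=scores.get)

    def update(self, votes, truth):
        # Record whether each learner's vote matched the true label.
        for h, v in zip(self.hits, votes):
            h.append(1 if v == truth else 0)
```

In a stream loop, one would typically call `predict` on the base learners' votes, then `update` once the true label arrives, and reset or retrain a learner when `ks_drift` fires on its windows. Because the test compares whole empirical distributions rather than window means, it makes no assumption about the distribution of the input.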
Authors  Cui Ruihua (崔瑞华); Qi Xiaolong (綦小龙); Liu Yanfang (刘艳芳); Lin Ling (林玲). Affiliations: Department of Network Security and Information Technology, Yili Normal University, Yining 835000, China; College of Mathematics and Information Engineering, Longyan University, Longyan 364012, China; State Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China.
Source  Journal of Nanjing University (Natural Science) (《南京大学学报(自然科学版)》), 2023, No. 1, pp. 134-144 (11 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心).
Funding  Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01C466, 2021D01C467).
Keywords  stream data; concept drift; online learning; ensemble learning