
Concept Drift Detection and Convergence Based on Hybrid Ensemble of Serial and Cross (cited by: 2)
Original Chinese title: 基于串行交叉混合集成的概念漂移检测及收敛方法
Abstract: Concept drift is an important and difficult problem in streaming data mining. Most existing methods for handling concept drift adopt an ensemble learning strategy. However, most of them cannot promptly extract the key information of the new data distribution after drift occurs, which leads to poor model performance. To address this problem, this paper proposes a concept drift detection and convergence method based on a hybrid ensemble of serial and cross classifiers (SC_ensemble). While the streaming data are in a stable state, the method builds serial base classifiers and ensembles them to extract effective information representing the overall data distribution. After concept drift occurs, parallel cross base classifiers are constructed near the drift point and ensembled to extract local effective information representing the latest data distribution. By combining serial and cross base classifiers in a hybrid ensemble, the method retains the overall distribution information contained in the streaming data while reinforcing the important local information at the time of drift, so that the ensemble contains more "good but different" base learners and the learning models are fused efficiently after drift. Experimental results show that the method lets the online learning model converge quickly after concept drift and improves its generalization performance.
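The abstract does not spell out how the serial and cross classifiers are built or combined, so the following is only a rough sketch of the general idea, not the authors' SC_ensemble. Several things here are assumptions made for illustration: the nearest-centroid base learner (`CentroidClassifier`), the drift test based on a drop in chunk accuracy (`drift_drop`), the overlapping windows used for the cross classifiers, the accuracy-weighted vote, and the toy stream generator `make_chunk`.

```python
from collections import Counter, deque


class CentroidClassifier:
    """Nearest-centroid base learner (a stand-in; the paper's base learner is not given here)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for xi, yi in zip(X, y):
            acc = sums.setdefault(yi, [0.0] * len(xi))
            for j, v in enumerate(xi):
                acc[j] += v
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_one(self, x):
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=dist)


def accuracy(model, X, y):
    """Fraction of samples in the chunk that the model labels correctly."""
    return sum(model.predict_one(xi) == yi for xi, yi in zip(X, y)) / len(y)


class HybridEnsemble:
    """Serial members cover the stream as a whole; cross members cover the drift region."""

    def __init__(self, max_serial=5, n_cross=3, drift_drop=0.3):
        self.serial = deque(maxlen=max_serial)  # one classifier per recent chunk
        self.cross = []                         # rebuilt whenever drift is flagged
        self.weights = []
        self.n_cross = n_cross
        self.drift_drop = drift_drop            # assumed accuracy-drop threshold
        self.prev_acc = None

    def _members(self):
        return list(self.serial) + self.cross

    def predict_one(self, x):
        votes = Counter()
        for member, weight in zip(self._members(), self.weights):
            votes[member.predict_one(x)] += weight
        return votes.most_common(1)[0][0]

    def partial_fit(self, X, y):
        """Test on the chunk, then train on it; return True if drift was flagged."""
        drift = False
        if self.serial:
            acc = accuracy(self, X, y)
            drift = self.prev_acc is not None and self.prev_acc - acc > self.drift_drop
            self.prev_acc = None if drift else acc
        if drift:
            # Cross classifiers: overlapping windows of the chunk where drift appeared.
            step = max(1, len(X) // (self.n_cross + 1))
            self.cross = [
                CentroidClassifier().fit(X[i * step:(i + 2) * step],
                                         y[i * step:(i + 2) * step])
                for i in range(self.n_cross)
            ]
        self.serial.append(CentroidClassifier().fit(X, y))
        # Weight each member by its accuracy on the newest chunk, so members
        # trained on an outdated concept lose influence without being deleted.
        self.weights = [accuracy(m, X, y) for m in self._members()]
        return drift


def make_chunk(flip, n=40):
    """Toy two-class chunk; flip=True swaps the labels to simulate abrupt drift."""
    X, y = [], []
    for i in range(n):
        c = i % 2
        X.append([c + 0.01 * (i % 5), c - 0.01 * (i % 3)])
        y.append(1 - c if flip else c)
    return X, y
```

Feeding the ensemble three chunks from one concept followed by chunks with swapped labels (`ens.partial_fit(*make_chunk(True))`) exercises both paths: the serial members accumulate during the stable phase, and the cross members are created on the first chunk where accuracy collapses. Keeping the stale serial members with low weight rather than discarding them is a design choice of this sketch, meant to mirror the abstract's point that the hybrid ensemble keeps overall distribution information alongside the local information near the drift.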
Authors: GUO Husheng (郭虎升); GAO Shuhua (高淑花); WANG Wenjian (王文剑) (School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China; Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan 030006, China)
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 5, pp. 997-1011 (15 pages)
Funding: National Natural Science Foundation of China (62276157, U21A20513, 62076154, 61503229); Central Government Guided Local Science and Technology Development Fund (YDZX20201400001224); Natural Science Foundation of Shanxi Province (201901D111033); Key Research and Development Program of Shanxi Province (International Cooperation) (201903D421050).
Keywords: streaming data; concept drift; ensemble learning; serial classifier; cross classifier; hybrid ensemble
