Abstract
Ensemble models can cope with concept drift in real-time data stream classification. Many classical ensemble algorithms handle the drift that arises in data streams either by sampling the data or by detecting concept drift and then updating the ensemble. How to build a model quickly and in time on the current concept remains a central problem in online data stream learning. Drawing on ideas from incremental learning and transfer learning, this paper proposes HAEL, a new data stream ensemble classification algorithm in which historical models adapt to concept drift. HAEL introduces an attention mechanism into the ensemble so that the construction and updating of the classification model always give priority to the current data, and it uses an accuracy comparison range parameter to adjust how strongly the model attends to the current data, which helps the model cope with concept drift. Experiments on datasets with four types of concept drift show that HAEL achieves higher classification accuracy than the traditional algorithms it is compared with.
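The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea it describes: keep historical models, score them on the most recent data, and let a range parameter on accuracy decide which of them still get a vote. It is not the HAEL algorithm. The class names `CentroidClassifier` and `RecencyWeightedEnsemble`, the parameters `acc_range` and `max_members`, the nearest-centroid base learner, and the weighting rule are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


class CentroidClassifier:
    """Tiny nearest-centroid base learner (a stand-in; the abstract names no base learner)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for xi, yi in zip(X, y):
            acc = sums.setdefault(yi, [0.0] * len(xi))
            for j, v in enumerate(xi):
                acc[j] += v
        # One centroid per class: the mean of that class's feature vectors.
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_one(self, x):
        return min(self.centroids, key=lambda c: math.dist(x, self.centroids[c]))


def accuracy(model, X, y):
    return sum(model.predict_one(xi) == yi for xi, yi in zip(X, y)) / len(y)


class RecencyWeightedEnsemble:
    """Hypothetical chunk-based ensemble; NOT the HAEL algorithm itself."""

    def __init__(self, acc_range=0.1, max_members=5):
        self.acc_range = acc_range      # assumed: tolerated accuracy gap to the best member
        self.max_members = max_members  # assumed: cap on stored historical models
        self.members = []               # historical models, oldest first
        self.weights = []

    def partial_fit(self, X, y):
        # Train a new member on the latest chunk and keep only the newest models.
        self.members.append(CentroidClassifier().fit(X, y))
        self.members = self.members[-self.max_members:]
        # Score every member on the current chunk, so current data drives the weights.
        accs = [accuracy(m, X, y) for m in self.members]
        best = max(accs)
        # Members within acc_range of the best vote with their accuracy; the rest get 0.
        self.weights = [a if best - a <= self.acc_range else 0.0 for a in accs]
        return self

    def predict_one(self, x):
        votes = defaultdict(float)
        for m, w in zip(self.members, self.weights):
            if w > 0:
                votes[m.predict_one(x)] += w
        if not votes:  # degenerate case: fall back to the newest member
            return self.members[-1].predict_one(x)
        return max(votes, key=votes.get)


# Toy stream with an abrupt drift: the labels swap between the two chunks.
X_chunk = [[0.0], [0.1], [0.9], [1.0]]
ens = RecencyWeightedEnsemble(acc_range=0.1)
ens.partial_fit(X_chunk, [0, 0, 1, 1])
before_drift = ens.predict_one([0.05])
ens.partial_fit(X_chunk, [1, 1, 0, 0])
after_drift = ens.predict_one([0.05])
```

In this toy run, the model trained before the drift scores poorly on the newest chunk, falls outside `acc_range`, and stops voting, so the prediction follows the current concept. A larger `acc_range` would let more historical models keep voting, which may help when the drift is gradual or a concept recurs.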
Authors
LV Yan-xia (吕艳霞)
LIU Bo-nan (刘波男)
WANG Cui-rong (王翠荣)
WANG Cong (王聪)
WAN Cong (万聪)
School of Computer Science and Engineering, Northeastern University, Shenyang 110004, China; School of Computer and Communication Engineering, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
Peking University Core Journal
2019, Issue 12, pp. 2624-2630 (7 pages)
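Funding
Supported by the National Natural Science Foundation of China (61702089, 61876205, 61501102)
Supported by the Natural Science Foundation of Hebei Province (F2016501079)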
Keywords
data stream classification
concept drift
ensemble model
attention mechanism