Abstract
Classification of dynamic imbalanced data is an important research problem in online learning and class-imbalance learning. It deals with data streams whose class distribution is highly skewed. Such problems are common in practice, for example in fault diagnosis for real-time control and monitoring systems and in intrusion detection for computer networks. Because dynamic data streams exhibit both concept drift and class imbalance, a stream classification algorithm must handle concept drift and address class imbalance at the same time. To this end, a method is proposed that processes imbalanced data while detecting concept drift. The method uses the Kappa coefficient to detect concept drift, then checks the balance rate, and finally updates the classifier with an imbalanced-data classification method. Experimental results show that the algorithm achieves good classification performance on imbalanced data streams across different evaluation metrics.
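The abstract names two quantities that are checked on the stream: the Kappa coefficient and the balance rate. The sketch below illustrates one way such a check could look on a single window of labeled examples. It is not the authors' algorithm. It uses the standard Cohen's kappa and defines the balance rate as the minority-to-majority count ratio, which is an assumption. The function names `cohen_kappa`, `balance_ratio`, and `check_window`, and the thresholds `kappa_drop` and `min_balance`, are hypothetical and do not come from the paper.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Standard Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed agreement: fraction of predictions that match the true labels.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Kappa is undefined when only one class occurs; returning 0.0 is an assumption.
        return 0.0
    return (p_o - p_e) / (1.0 - p_e)


def balance_ratio(y_true):
    """Minority-to-majority count ratio in the window (assumed definition of balance rate)."""
    counts = Counter(y_true)
    if len(counts) < 2:
        return 0.0  # only one class observed: treated as maximally imbalanced
    return min(counts.values()) / max(counts.values())


def check_window(y_true, y_pred, prev_kappa, kappa_drop=0.2, min_balance=0.5):
    """Flag possible drift (kappa fell by more than kappa_drop) and imbalance.

    Both thresholds are illustrative placeholders, not values from the paper.
    """
    kappa = cohen_kappa(y_true, y_pred)
    drift = (prev_kappa - kappa) > kappa_drop
    imbalanced = balance_ratio(y_true) < min_balance
    return kappa, drift, imbalanced
```

In a streaming setting, a check like this would run once per window. A drift flag would trigger a classifier update, and an imbalance flag would select an imbalance-aware update such as resampling. How the paper combines the two checks is not specified in the abstract.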
Authors
WANG Junhong (王俊红)
GUO Yahui (郭亚慧)
WANG Junhong; GUO Yahui (School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China; Key Laboratory of Computational Intelligence and Chinese Information Processing, Ministry of Education, Taiyuan 030006, China)
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2021, Issue 13, pp. 124-129 (6 pages)
Funding
National Natural Science Foundation of China (61772323)
Natural Science Foundation of Shanxi Province (201701D121051)
Keywords
data streams
imbalanced data
concept drift
Kappa coefficient
classification algorithm