Abstract
To address the problems of sample labeling and concept drift in data stream classification, we propose a data stream classification model based on instance transfer. First, the model uses a support vector machine as the learner; the support vectors of the resulting classification model form the source domain, and the current data block to be classified forms the target domain. Then, following the mutual nearest neighbor idea, the true neighbors of the target-domain samples are selected from the source domain for instance transfer, which avoids negative transfer. Finally, the target domain and the transferred samples are merged into a training set, which increases the number of labeled samples and improves the generalization ability of the model. Theoretical analysis and experimental results show that the proposed method is feasible and achieves higher classification accuracy than the other learning methods compared.
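The instance-selection step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: Euclidean distance, a mutual k-nearest-neighbor test (a source sample is kept only if it and some target sample each rank the other among their k nearest), and support vectors supplied as plain lists of points rather than taken from a trained SVM. The function names `k_nearest`, `select_transfer_instances`, and `build_training_set` are hypothetical and are not taken from the paper.

```python
import math


def k_nearest(query, pool, k):
    """Return the indices of the k points in `pool` closest to `query` (Euclidean)."""
    order = sorted(range(len(pool)), key=lambda i: math.dist(query, pool[i]))
    return set(order[:k])


def select_transfer_instances(source, target, k=1):
    """Return sorted indices of source samples that are mutual k-nearest
    neighbors of at least one target sample (assumed selection rule)."""
    # For each target sample: its k nearest neighbors in the source domain.
    target_to_source = [k_nearest(t, source, k) for t in target]
    # For each source sample: its k nearest neighbors in the target domain.
    source_to_target = [k_nearest(s, target, k) for s in source]

    selected = set()
    for t_idx, neighbors in enumerate(target_to_source):
        for s_idx in neighbors:
            # Keep the source sample only if the relation holds both ways.
            if t_idx in source_to_target[s_idx]:
                selected.add(s_idx)
    return sorted(selected)


def build_training_set(source_X, source_y, target_X, target_y, k=1):
    """Merge the labeled target samples with the transferred source samples."""
    keep = select_transfer_instances(source_X, target_X, k)
    X = list(target_X) + [source_X[i] for i in keep]
    y = list(target_y) + [source_y[i] for i in keep]
    return X, y
```

In this sketch a source point that lies far from every target sample is never chosen by any target sample as a neighbor, so it is left out of the merged training set; this is one plausible way to read how the mutual-neighbor test guards against negative transfer.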
Authors
LIU Sanmin (刘三民)
LIU Yuxia (刘余霞)
College of Computer and Information, Anhui Polytechnic University, Wuhu 241000, China
Source
Information and Control (《信息与控制》)
CSCD
Peking University Core Journals (北大核心)
2019, No. 3, pp. 380-384 (5 pages)
Funding
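National Natural Science Foundation of China (71371012)
Natural Science Foundation of Anhui Province (1608085MF147)
Humanities and Social Sciences Foundation of the Ministry of Education (18YJA630114)
Anhui Province Promotion Plan, General Project (TSKJ2016B05)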
Keywords
mutual nearest neighbor
transfer learning
data stream classification
incremental learning