Data Flow Classification Algorithm Based on Integrated Classifier (基于集成分类器的数据流分类算法) Cited by: 3
Abstract: As a typical form of big data, a data stream is continuous, unbounded, fast-arriving, and subject to concept drift, so traditional classification techniques cannot be applied to data stream mining directly and effectively. Building on the classic accuracy weighted ensemble (AWE) algorithm, this paper proposes the concept-adapting very fast decision tree (CVFDT) update ensemble (CUE) algorithm. CUE improves how weights are assigned to the base classifiers, reduces sensitivity to the data block size, and increases the diversity among base classifiers. Experiments show that CUE achieves higher classification accuracy than AWE. Finally, the paper proposes the dynamic classifier selection with clustering (DCSC) algorithm. Because DCSC is based on dynamic classifier selection and has no cumbersome weight-assignment mechanism, its time efficiency is relatively high. Experimental results confirm that DCSC is both effective and efficient, and that it handles concept drift effectively.
Authors: Han Donghong; Ma Xianzhe; Li Lili; Wang Guoren (School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China)
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, Peking University Core Journal, 2018, No. 6, pp. 1021-1033 (13 pages)
Funding: National Natural Science Foundation of China (61173029, 61272182, 61672144, 61332006)
Keywords: data streams; base classifier; ensemble classifier; decision tree; concept drift; clustering
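
For readers unfamiliar with the AWE baseline named in the abstract, the following is a minimal sketch of accuracy-based weighting as it is usually described for AWE: each base classifier is scored on the most recent data block by its mean squared error, and its weight is the margin by which it beats a classifier that guesses according to the class distribution. This is not the CUE or DCSC algorithm from the paper, whose details the abstract does not give. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def mse_random(labels):
    """MSE_r: error of a classifier that guesses by class frequency.

    MSE_r = sum over classes c of p(c) * (1 - p(c))^2.
    """
    n = len(labels)
    return sum((k / n) * (1 - k / n) ** 2 for k in Counter(labels).values())


def mse_classifier(true_class_probs):
    """MSE_i: mean of (1 - f(x))^2, where f(x) is the probability the
    classifier assigned to the true class of each example in the block."""
    return sum((1 - p) ** 2 for p in true_class_probs) / len(true_class_probs)


def awe_weight(true_class_probs, labels):
    """Weight w_i = MSE_r - MSE_i, clipped at zero (assumption: classifiers
    no better than random guessing simply get no vote in this sketch)."""
    return max(0.0, mse_random(labels) - mse_classifier(true_class_probs))


def weighted_vote(votes, weights):
    """Combine one predicted label per base classifier by summed weight."""
    totals = Counter()
    for label, w in zip(votes, weights):
        totals[label] += w
    return totals.most_common(1)[0][0]


if __name__ == "__main__":
    block_labels = [0, 0, 1, 1]           # toy "most recent" data block
    good = [0.9, 0.8, 0.7, 0.6]           # probabilities given to true classes
    coin = [0.5, 0.5, 0.5, 0.5]           # uninformative classifier
    w = [awe_weight(good, block_labels), awe_weight(coin, block_labels)]
    print(w, weighted_vote([1, 0], w))
```

Because every weight is recomputed on each new block, the block size directly affects how quickly the ensemble reacts to drift, which is the sensitivity the abstract says CUE is designed to reduce.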