
Ensemble Classification Algorithm for Data Streams with Noise and Concept Drifts (cited by: 8)
Abstract: Classifying data streams that contain hidden concept drift is an active research topic in data mining, and noise in real-world data streams degrades classification quality. This paper therefore proposes an ensemble classification algorithm for data streams with noise and concept drifts. The algorithm uses support vector machines as base classifiers, applies a Bayesian classifier to filter out noisy data, detects concept drifts with dual thresholds determined by the Hoeffding bounds inequality, and dynamically updates the classification model to adapt to changes in the data stream. Experimental results show that the proposed algorithm can effectively track and detect concept drifts in noisy data streams while achieving good classification accuracy.
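The abstract does not spell out how the dual thresholds are derived, so the following is only a minimal sketch of one common reading, not the authors' procedure. It assumes the Hoeffding bound ε = sqrt(R² ln(1/δ) / (2n)) is computed over the n examples of a chunk, and that the rise in error rate between two consecutive chunks is compared against ε and a multiple of ε. The function names, the `factor` parameter, and the "warning"/"drift" labels are hypothetical.

```python
import math


def hoeffding_bound(n, delta=0.05, value_range=1.0):
    """Hoeffding bound for the mean of n observations.

    With probability at least 1 - delta, the observed mean lies within
    this epsilon of the true mean. value_range is the width of the
    observed variable's range (1.0 for an error rate in [0, 1]).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def detect_change(prev_error, curr_error, n, delta=0.05, factor=2.0):
    """Classify the error-rate change between two chunks of size n.

    Assumption (not from the paper): the lower threshold is epsilon and
    the upper threshold is factor * epsilon.
    """
    eps = hoeffding_bound(n, delta)
    diff = curr_error - prev_error
    if diff > factor * eps:
        return "drift"      # rise exceeds the upper threshold
    if diff > eps:
        return "warning"    # rise falls between the two thresholds
    return "stable"         # rise is within sampling fluctuation


if __name__ == "__main__":
    # Hypothetical error rates on consecutive chunks of 200 examples.
    for prev, curr in [(0.10, 0.12), (0.10, 0.22), (0.10, 0.35)]:
        print(prev, curr, detect_change(prev, curr, n=200))
```

In a scheme like this, the intermediate band gives the model a way to treat a moderate rise in error as possible noise instead of rebuilding the ensemble immediately. How the paper itself uses the band between its two thresholds is not stated in the abstract.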
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journal, 2016, No. 7, pp. 1445-1449 (5 pages)
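Funding: Supported by the National Natural Science Foundation of China (51174257 F030504); the Fundamental Research Funds for the Central Universities (2013BHZX0040); the Key Natural Science Project of the Anhui Provincial Department of Education (KJ2016A549); and the Natural Science Project of Fuyang Normal College (2016FSKJ17).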
Keywords: data streams; noise; concept drifts; classification; ensemble model