
Data Stream Concept Drift Detection Method Based on Mixture Ensemble Method

Cited by: 9
Abstract: Data stream classification has attracted wide attention in recent years, and concept drift detection is one of its central problems. Existing classifiers are ensembles built either from a single type of base classifier or from hybrid base classifiers, and their drift detection mechanisms mostly rely on idealized distribution assumptions (the stationary and learnable assumptions). Single-type ensembles can amplify classification error, and their accuracy degrades on noisy data streams; hybrid ensembles often struggle to achieve good classification accuracy and low time cost at the same time. To address this, the paper extends the simple WE ensemble framework into WE-DTB, an ensemble classification method whose base models combine decision trees and Naive Bayes, and applies two typical drift detection mechanisms, Hoeffding bounds and the μ test, to detect concept drift and classify data streams. Extensive experiments show that WE-DTB detects concept drift effectively while maintaining good classification accuracy and good time and space performance.
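Source: Computer Science (《计算机科学》), indexed in CSCD and the PKU Core Journals list (北大核心), 2012, No. 1, pp. 152-155, 181 (5 pages)
Funding: Supported by the National Natural Science Foundation of China (60975034) and the Natural Science Foundation of Anhui Province (090412044)
Keywords: data streams; concept drift; classification; noise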
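
The abstract names Hoeffding bounds as one of the drift detection mechanisms but does not give the test itself. As a rough illustration only, the following sketch shows the standard Hoeffding bound and one common way it can be used to flag a suspected drift between data chunks. The function names, the error-rate comparison, the treatment of the baseline error as a fixed reference, and the default `delta` are all assumptions made for this example, not the procedure used in WE-DTB.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Standard Hoeffding bound.

    With probability at least 1 - delta, the mean of n independent
    observations, each lying in an interval of width value_range,
    is within the returned epsilon of the true mean.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def drift_suspected(baseline_error: float, current_error: float,
                    n: int, delta: float = 0.05) -> bool:
    """Hypothetical drift check (an assumption, not the paper's test).

    Flags drift when the error rate on the current chunk of n instances
    exceeds a baseline error rate by more than the Hoeffding bound.
    Error rates lie in [0, 1], so the value range is 1.
    """
    epsilon = hoeffding_bound(1.0, delta, n)
    return (current_error - baseline_error) > epsilon


if __name__ == "__main__":
    # Chunk of 1000 instances: a jump from 10% to 20% error is flagged,
    # while a rise from 10% to 12% stays inside the bound.
    print(drift_suspected(0.10, 0.20, n=1000))  # True
    print(drift_suspected(0.10, 0.12, n=1000))  # False
```

A smaller `delta` or a smaller chunk size widens the bound, which makes the check more tolerant of noise at the cost of slower reaction to genuine drift.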

References (12)

  • [1] Widmer G, Kubat M. Learning in the Presence of Concept Drift and Hidden Contexts[J]. Machine Learning, 1996, 23(1): 69-101.
  • [2] Schlimmer J C, Granger R H. Incremental Learning from Noisy Data[J]. Machine Learning, 1986, 1(3): 317-354.
  • [3] Wang H X, Fan W, Yu P S, et al. Mining Concept-Drifting Data Streams Using Ensemble Classifiers[C]//Proc of 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Washington: ACM Press, 2003: 226-235.
  • [4] Scholz M, Klinkenberg R. An Ensemble Classifier for Drifting Concepts[C]//Proc of 16th European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Porto, 2005.
  • [5] Fan W. Systematic Data Selection to Mine Concept-Drifting Data Streams[C]//Proc of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Seattle, 2004: 128-137.
  • [6] Gao J, Fan W, Han J W. On Appropriate Assumptions to Mine Data Streams: Analysis and Practice[C]//Proc of the 7th IEEE International Conference on Data Mining. Omaha, 2007: 143-152.
  • [7] Zhang P, Zhu X Q, Shi Y, et al. An Aggregate Ensemble for Mining Concept Drifting Data Streams with Noise[C]//Proc of the 13th Pacific-Asia Conference on Knowledge Discovery. Bangkok, 2009: 1021-1029.
  • [8] Li P P, Hu X G, Wu X D. Concept Drifting Detection on Noisy Streaming Data in Random Ensemble Decision Trees[C]//Proc of the 6th International Conference on Machine Learning and Data Mining. 2009: 236-250.
  • [9] Li Y, Zhang Y H, Hu X G. A Classification Algorithm for Noisy Data Streams[C]//Proc of 3rd International Conference of Fuzzy Systems and Knowledge Discovery (FSKD). Yantai, China: Springer, 2010: 2239-2244.
  • [10] ACM Special Interest Group on Knowledge Discovery and Data Mining. KDDCUP99 data set[EB/OL]. http://kdd.ics.uci.edu/databases/kddcup99, 1999.

