Exploiting Empirical Variance for Data Stream Classification

Abstract: Classification using the decision tree algorithm is a widely studied problem in data streams. The challenge is deciding when to split a decision node into multiple leaves. Concentration inequalities that exploit variance information, such as Bernstein's and Bennett's inequalities, are often substantially tighter than Hoeffding's bound, which disregards variance. Many machine learning algorithms for stream classification, such as the very fast decision tree (VFDT) learner, AdaBoost and support vector machines (SVMs), use Hoeffding's bound as a performance guarantee. In this paper, we propose a new algorithm based on the recently proposed empirical Bernstein's bound to achieve a better probabilistic bound on the accuracy of the decision tree. Experimental results on four synthetic and two real-world data sets demonstrate the performance gain of our proposed technique.
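To make the contrast between the two bounds concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the usual VFDT-style form of Hoeffding's bound, and one commonly cited form of the empirical Bernstein bound (the variant with a ln(3/δ) term); the exact expression used in the paper is not given in this abstract. The function names and the `should_split` rule are hypothetical illustrations.

```python
import math


def hoeffding_epsilon(value_range, delta, n):
    """Hoeffding deviation: R * sqrt(ln(1/delta) / (2n)). Ignores variance."""
    return value_range * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def empirical_bernstein_epsilon(sample_variance, value_range, delta, n):
    """One common empirical Bernstein form (assumed here, not taken from the paper):
    sqrt(2 * V * ln(3/delta) / n) + 3 * R * ln(3/delta) / n,
    where V is the empirical variance of the observed values."""
    log_term = math.log(3.0 / delta)
    variance_part = math.sqrt(2.0 * sample_variance * log_term / n)
    range_part = 3.0 * value_range * log_term / n
    return variance_part + range_part


def should_split(best_gain, second_gain, epsilon):
    """Hypothetical split rule: split once the gap between the two best
    split candidates exceeds the deviation bound."""
    return (best_gain - second_gain) > epsilon


if __name__ == "__main__":
    R, delta = 1.0, 1e-7
    for n in (1_000, 10_000, 100_000):
        h = hoeffding_epsilon(R, delta, n)
        b = empirical_bernstein_epsilon(0.01, R, delta, n)
        print(f"n={n}: Hoeffding={h:.4f}, empirical Bernstein={b:.4f}")
```

Under these assumptions, the variance-aware bound is smaller when the observed variance is low relative to the range, so a node could satisfy the split rule after fewer examples. When the variance is close to its maximum, this form can be looser than Hoeffding's bound.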
Source: Journal of Shanghai Jiaotong University (Science) (上海交通大学学报(英文版)), EI-indexed, 2012, No. 2, pp. 245-250 (6 pages)
Funding: the National Natural Science Foundation of China (Nos. 60873108, 61175047 and 61152001); the Fundamental Research Funds for the Central Universities of China (No. SWJTU11ZT08)
Keywords: Hoeffding and Bernstein's bounds, data stream classification, decision tree, anytime algorithm