Abstract
Decision tree classification is an effective classification method in data mining. Univariate decision trees have a simple structure but tend to be large. Multivariate decision trees were proposed to reduce tree size further: by using a reasonable combination of several attributes as the split attribute, they yield relatively small trees while maintaining high prediction accuracy. This paper addresses two weaknesses of the previously proposed hybrid-variable decision tree algorithm RSH2, namely its poor noise tolerance and its repeated selection of the same attributes, and on that basis proposes VPMDT (variable precision multivariate decision tree), a multivariate decision tree algorithm based on rough set theory. Experimental comparisons with ID3, HACRs, RSH2 and C4.5 show that VPMDT has good runtime and space performance while maintaining high classification prediction accuracy.
Source
Journal of Hefei University of Technology (Natural Science)
CAS
CSCD
Peking University Core Journals
2009, No. 12, pp. 1833-1838 (6 pages)
Funding
National Natural Science Foundation of China (Grant No. 60975034)
Natural Science Foundation of Anhui Province (Grant No. 090412044)
Keywords
decision tree
multivariate
rough set