Abstract
Irrelevant and redundant attributes in a sample can reduce the classification accuracy of decision tree algorithms. To address this, this paper proposes building the decision tree after an attribute reduction based on a consistency measure. Five two-class data sets from the UCI Machine Learning Repository are discretized, and C4.5 and CART decision trees are then built on attribute reductions obtained from rough set theory and from the consistency measure, respectively. The experiments show that decision trees built on the consistency-based attribute reduction achieve relatively high accuracy and that the approach is feasible.
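The abstract does not spell out how the consistency measure is computed, so the following is only a minimal sketch under stated assumptions. It assumes the common definition in which a subset of discretized attributes is scored by the fraction of samples that agree with the majority class of their attribute-value pattern, and it assumes a simple greedy backward elimination as the search strategy. The function names `consistency` and `reduce_attributes` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def consistency(rows, labels, attrs):
    """Fraction of samples that match the majority class of their pattern.

    Assumed definition: samples are grouped by their values on `attrs`;
    within each group only the majority class counts as consistent.
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        groups[key][label] += 1
    consistent = sum(max(counts.values()) for counts in groups.values())
    return consistent / len(rows)


def reduce_attributes(rows, labels):
    """Greedy backward elimination (an assumed search strategy).

    An attribute is dropped whenever the remaining attributes are still
    as consistent as the full attribute set.
    """
    attrs = list(range(len(rows[0])))
    target = consistency(rows, labels, attrs)
    for a in list(attrs):
        trial = [b for b in attrs if b != a]
        if trial and consistency(rows, labels, trial) >= target:
            attrs = trial
    return attrs


# Toy discretized data: attribute 0 determines the class, attribute 1 is
# irrelevant, and attribute 2 is a redundant (inverted) copy of attribute 0.
rows = [
    (0, 0, 1),
    (0, 1, 1),
    (1, 0, 0),
    (1, 1, 0),
]
labels = ["no", "no", "yes", "yes"]

reduct = reduce_attributes(rows, labels)
print(reduct)
```

On this toy data the irrelevant attribute and one of the two mutually redundant attributes are dropped. The reduced columns could then be passed to any decision tree learner, such as a C4.5 or CART implementation.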
Source
Computer and Modernization (《计算机与现代化》)
2011, No. 9, pp. 181-184 (4 pages)
Keywords
rough set
attribute reduction
decision tree
consistency criterion