Abstract
Decision trees are an important data mining method, commonly used to build classifiers and predictive models. ID3, the core decision tree algorithm, is widely used in classification problems because it is simple and efficient. However, it tends to choose attributes with many distinct values as the splitting attribute, so attributes with strong classification ability may be missed. This paper improves the ID3 splitting strategy by also taking into account the class discrimination degree of each attribute: when several attributes share the highest information gain, the algorithm selects the one that yields the better classification effect. Experimental comparison with classical ID3 shows that the new method reduces the misclassification rate and produces a simpler decision tree.
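The bias described in the abstract can be illustrated with a minimal Python sketch of the ID3 information-gain criterion. The toy data, the function names, and in particular the tie-breaking rule are assumptions made for illustration. The abstract does not define the paper's class discrimination measure, so the sketch substitutes a simple stand-in (prefer the tied attribute with the fewest distinct values) and should not be read as the authors' method.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def partition(rows, labels, attr):
    """Group the class labels by the value each row takes on `attr`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return groups


def information_gain(rows, labels, attr):
    """ID3 criterion: entropy before the split minus weighted entropy after it."""
    groups = partition(rows, labels, attr)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def choose_attribute(rows, labels, attrs, tol=1e-9):
    """Pick the attribute with the highest gain, breaking ties explicitly.

    ASSUMPTION: the tie-breaker below (fewest distinct values) is a stand-in,
    not the class discrimination measure proposed in the paper.
    """
    gains = {a: information_gain(rows, labels, a) for a in attrs}
    best = max(gains.values())
    tied = [a for a in attrs if best - gains[a] <= tol]
    return min(tied, key=lambda a: len(partition(rows, labels, a)))


# Hypothetical toy data: "id" is unique per row, "windy" predicts the class,
# "color" carries no information about the class.
rows = [
    {"id": 1, "windy": "yes", "color": "red"},
    {"id": 2, "windy": "yes", "color": "blue"},
    {"id": 3, "windy": "no", "color": "red"},
    {"id": 4, "windy": "no", "color": "blue"},
]
labels = ["stay", "stay", "play", "play"]
attrs = ["id", "windy", "color"]

for a in attrs:
    print(a, round(information_gain(rows, labels, a), 3))
print("chosen:", choose_attribute(rows, labels, attrs))
```

In this toy data, "id" and "windy" both reach the maximum gain, although a split on "id" generalizes to nothing. Plain ID3 has no principled way to separate the two, which is the gap an additional selection criterion is meant to close.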
Source
《现代计算机》 (Modern Computer)
2009, No. 5, pp. 43-46 (4 pages)
Keywords
Decision Tree
Attribute
Class Discrimination Degree of an Attribute (Full Division Class Number)