Abstract
This paper first analyzes the relationship among three attribute selection criteria, based respectively on the positive region, the rough boundary, and attribute dependency, and proves that the three criteria are equivalent to one another. Taking the positive-region criterion as representative, it then analyzes the advantages and shortcomings of decision tree construction algorithms based on the positive region. To address these shortcomings, a new attribute selection criterion based on the adjoint positive region is proposed. Decision trees built with the new criterion generally have fewer leaves, a smaller average leaf depth, and leaves with stronger generalization ability. Finally, an example illustrates the advantages of the new attribute selection criterion.
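The equivalence claimed in the abstract can be illustrated with a minimal sketch, assuming the standard Pawlak rough-set definitions (the paper's own definitions may differ in detail): the positive region is the union of equivalence classes with a single decision value, the boundary is the rest of the universe, and the dependency degree is the positive region's share of the universe. The table `ROWS`, the attribute names `a` and `b`, and the helpers `positive_region`, `scores`, and `best_attribute` are hypothetical and used only for illustration. The adjoint positive region is not sketched because the abstract does not define it.

```python
from collections import defaultdict

# Hypothetical toy decision table: (condition attribute values, decision value).
ROWS = [
    ({"a": 0, "b": 0}, "no"),
    ({"a": 0, "b": 1}, "no"),
    ({"a": 1, "b": 0}, "yes"),
    ({"a": 1, "b": 1}, "yes"),
    ({"a": 2, "b": 0}, "yes"),
    ({"a": 2, "b": 0}, "no"),
]


def positive_region(rows, attrs):
    """Return indices of rows whose equivalence class under attrs is decision-pure."""
    blocks = defaultdict(list)
    for i, (cond, _) in enumerate(rows):
        blocks[tuple(cond[x] for x in attrs)].append(i)
    pos = set()
    for members in blocks.values():
        # A block belongs to the positive region if all its rows share one decision.
        if len({rows[i][1] for i in members}) == 1:
            pos.update(members)
    return pos


def scores(rows, attrs):
    """Compute the three quantities (assumed definitions) for one attribute set."""
    n = len(rows)
    pos_size = len(positive_region(rows, attrs))
    return {
        "pos_size": pos_size,            # larger is better
        "boundary_size": n - pos_size,   # smaller is better
        "dependency": pos_size / n,      # larger is better
    }


def best_attribute(rows, candidates, criterion):
    """Pick the candidate attribute preferred by the named criterion."""
    sign = -1 if criterion == "boundary_size" else 1
    return max(candidates, key=lambda c: sign * scores(rows, [c])[criterion])


if __name__ == "__main__":
    for criterion in ("pos_size", "boundary_size", "dependency"):
        print(criterion, "->", best_attribute(ROWS, ["a", "b"], criterion))
```

Under these assumed definitions the boundary size is the universe size minus the positive-region size, and the dependency degree is the positive-region size divided by the universe size, so all three criteria induce the same ranking of attributes on any table.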
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2008, Issue 5, pp. 138-142 (5 pages)
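Funding
Supported by the Key Program of the National Natural Science Foundation of China (69835001)
Supported by the Key Science and Technology Project of the Ministry of Education ([2000]175)
Supported by the Beijing Natural Science Foundation (4022008)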
Keywords
Decision tree
Rough set
Positive region
Rough boundary
Attribute dependency
Adjoint positive region