Abstract
In test-cost-sensitive decision systems, test-cost-sensitive attribute reduction is an effective way to find an attribute set whose test cost is as low as possible. However, a minimal test cost reduct keeps only the most concise information needed to preserve the decision system, so the accuracy of a classifier built on it tends to drop. If the available resources are limited but exceed the minimal test cost, they can be used to obtain a higher-quality classifier. This paper addresses that setting in two ways. 1) Starting from a minimal test cost reduct, we add important attributes within the cost limit to find a better attribute set. 2) We propose an improved decision tree algorithm that builds each node from several of the currently best attribute values, chosen so that together they cover the current training set. Experimental results indicate that 1) the improved decision tree achieves higher classification accuracy than ID3; 2) adding important attributes to the minimal test cost reduct yields higher-quality classifiers than the minimal test cost reduct alone; and 3) the approach reduces test cost while maintaining classifier quality.
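The first contribution, adding attributes to a minimal test cost reduct while staying within a cost limit, can be illustrated with a minimal sketch. The abstract does not state how "important" attributes are ranked, so the information-gain-per-cost score below is an assumption, and the names `extend_reduct`, `costs`, `reduct`, and `budget` are hypothetical. The reduct itself is taken as given rather than computed.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in label entropy after splitting the rows on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def extend_reduct(rows, labels, costs, reduct, budget):
    """Greedily add attributes to `reduct` while total test cost <= budget.

    Assumption: candidates are ranked by information gain divided by test
    cost, and attributes with no gain are ignored. Test costs must be
    positive. This is an illustrative heuristic, not the paper's criterion.
    """
    selected = list(reduct)
    spent = sum(costs[a] for a in selected)
    gains = {a: information_gain(rows, labels, a) for a in costs if a not in reduct}
    candidates = sorted(
        (a for a, g in gains.items() if g > 0),
        key=lambda a: gains[a] / costs[a],
        reverse=True,
    )
    for attr in candidates:
        if spent + costs[attr] <= budget:
            selected.append(attr)
            spent += costs[attr]
    return selected, spent


# Toy decision system: four objects, four condition attributes, binary decision.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": 1},
    {"a": 0, "b": 1, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 1, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 0},
]
labels = [0, 0, 1, 1]
costs = {"a": 2, "b": 1, "c": 3, "d": 4}  # hypothetical test costs

selected, spent = extend_reduct(rows, labels, costs, reduct=["a"], budget=6)
```

In this toy system only `c` carries information about the decision beyond the assumed reduct `["a"]`, so it is the only attribute added, even though the remaining budget could also pay for `b`.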
Source
《漳州师范学院学报(自然科学版)》
2013, No. 1, pp. 24-28 (5 pages)
Journal of ZhangZhou Teachers College (Natural Science)
Funding
National Natural Science Foundation of China (Grant No. 61170129)
Keywords
data mining
cost-sensitive learning
limited cost
classification