Abstract
Decision tree classification is a highly effective machine learning method. Its advantages include high classification accuracy, good robustness to noisy data, and a tree-structured model. Work on optimizing decision tree algorithms has focused mainly on the criterion for selecting split attributes, on pruning the tree, and on incorporating fuzzy theory, rough set theory, genetic algorithms, and neural network algorithms. This paper uses the attribute importance principle from rough set theory to optimize the decision tree: it first computes the importance of each condition attribute to the classification, then filters the sample set according to those importance values, reducing the size of the tree without harming classification accuracy. The algorithm was implemented in Visual C++ 6.0 and applied to a hot rolling process model; processing hot rolling data verified the algorithm's effectiveness.
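The attribute importance step can be illustrated with a minimal sketch. The abstract does not state the formula, so this assumes the standard rough set definition: the importance of a condition attribute is the drop in the dependency degree (the fraction of samples in the positive region) when that attribute is removed. The function names, the threshold-based filter, and the toy attributes `temp`, `speed`, and `batch` are hypothetical and are not taken from the paper or its hot rolling data; the paper's own implementation was in Visual C++ 6.0 and is not shown here.

```python
from collections import defaultdict
from itertools import product


def dependency(rows, attrs, decision):
    """Dependency degree: share of rows whose equivalence class under
    `attrs` carries a single decision value (the positive region)."""
    if not rows:
        return 0.0
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in attrs)].append(row[decision])
    consistent = sum(len(labels) for labels in classes.values()
                     if len(set(labels)) == 1)
    return consistent / len(rows)


def significance(rows, attrs, decision):
    """Importance of each attribute: loss in dependency when it is dropped."""
    full = dependency(rows, attrs, decision)
    return {a: full - dependency(rows, [b for b in attrs if b != a], decision)
            for a in attrs}


def select_attributes(rows, attrs, decision, threshold=0.0):
    """Assumed filtering rule: keep attributes whose importance exceeds threshold."""
    sig = significance(rows, attrs, decision)
    return [a for a in attrs if sig[a] > threshold]


# Hypothetical toy data: grade is "ok" only when temp is high and speed is fast;
# batch has no influence on the decision.
ATTRS = ["temp", "speed", "batch"]
ROWS = [
    {"temp": t, "speed": s, "batch": b,
     "grade": "ok" if (t == "high" and s == "fast") else "bad"}
    for t, s, b in product(["high", "low"], ["fast", "slow"], ["a", "b"])
]

print(significance(ROWS, ATTRS, "grade"))       # {'temp': 0.5, 'speed': 0.5, 'batch': 0.0}
print(select_attributes(ROWS, ATTRS, "grade"))  # ['temp', 'speed']
```

In this toy case `batch` has zero importance, so a tree built on the filtered data needs only `temp` and `speed`. This reflects the abstract's stated goal of a smaller tree at the same accuracy, though the paper's actual filtering rule may differ.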
Source
Information Technology (《信息技术》)
2011, Issue 10, pp. 222-224, 227 (4 pages)
Keywords
decision tree classification
rough set
attribute importance
hot rolling process model