Improved C4.5 decision tree algorithm based on classification rules (cited by: 22)
Abstract: On large-sample data, the C4.5 decision tree algorithm requires the training set to stay resident in memory, its classification accuracy falls short of requirements, and it is unclear how to select the optimal classification rules. To address these problems, an improved C4.5 decision tree algorithm based on classification rule selection is proposed. Multiple classification rules are formed by repeatedly drawing random samples from the training set with replacement. The optimal feature values are then sought across these rules to establish an optimal classification rule, and partition similarity is used as the criterion for selecting the optimal feature of the C4.5 decision tree. On this basis, the C4.5 decision tree algorithm is improved using the selected optimal classification rule and optimal feature. Experimental results show that the improved algorithm effectively reduces the strong dependence of the C4.5 decision tree on the initial training set, significantly raises the recognition rate on large-sample data sets, and markedly shortens training time.
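The resampling step described in the abstract can be illustrated with a minimal sketch. It draws several samples with replacement, scores each categorical feature on every sample, and keeps the feature that wins most often. Two parts are assumptions rather than the paper's method: the score used here is the standard C4.5 gain ratio, because the abstract does not define partition similarity, and the majority vote across samples is only one plausible way to combine the per-sample results. The function names (`gain_ratio`, `bootstrap_sample`, `vote_best_feature`) are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, labels, feature):
    """Standard C4.5 gain ratio of one categorical feature (a column index).

    Assumption: this stands in for the paper's partition-similarity
    criterion, which the abstract does not define.
    """
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted entropy of the labels after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    # Entropy of the split itself; penalizes many-valued features.
    split_info = -sum(len(g) / total * math.log2(len(g) / total)
                      for g in groups.values())
    if split_info == 0:
        return 0.0  # the feature takes a single value in this sample
    return (entropy(labels) - remainder) / split_info


def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) examples at random with replacement."""
    idx = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in idx], [labels[i] for i in idx]


def vote_best_feature(rows, labels, n_rounds=25, seed=0):
    """Pick the feature that scores best on the most bootstrap samples.

    Assumption: majority voting is a hypothetical aggregation rule.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_rounds):
        s_rows, s_labels = bootstrap_sample(rows, labels, rng)
        best = max(range(len(rows[0])),
                   key=lambda f: gain_ratio(s_rows, s_labels, f))
        votes[best] += 1
    return votes.most_common(1)[0][0]
```

Calling `vote_best_feature(rows, labels)` on a list of equal-length tuples and a parallel list of class labels returns the index of the winning column. Because each round works on a resample rather than the full training set, the choice depends less on any single draw of the data.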
Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal, 2013, No. 12, pp. 4321-4325, 4330 (6 pages)
Funding: National High Technology Research and Development Program of China (863 Program) (2011AA010603, 2011AA010605)
Keywords: C4.5 decision tree; classification rules; attribute measures; partition similarity; feature selection