Application of Decision Tree Classification Method Based on Map/Reduce

Cited by: 4
Abstract: Traditional data mining approaches suffer from weak computing power, low efficiency, and poor scalability when processing massive, multi-dimensional, and complex data. This paper proposes a decision tree classification mining method based on Map/Reduce (the C4.5BH algorithm). The algorithm uses K-means clustering to discretize continuous attributes, and it uses the Map/Reduce programming model together with an attribute table structure to compute attributes in parallel and to split nodes in parallel during decision tree construction. Experiments show that, compared with the traditional C4.5 algorithm, C4.5BH achieves higher execution efficiency and a good speedup when processing large-scale data sets.
Source: Computer & Digital Engineering (《计算机与数字工程》), 2016, No. 8, pp. 1504-1510 (7 pages)
Funding: National Key Technology R&D Program of China (Grant No. 2015BAB07B01); Special Fund for Public Welfare Industry Research of the Ministry of Water Resources (Grant No. 201501022)
Keywords: Map/Reduce technology, K-means algorithm, decision tree, C4.5BH algorithm
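
The abstract names two steps without giving details: K-means discretization of continuous attributes, and Map/Reduce-style parallel evaluation of attributes. The following is a minimal single-process sketch of that general idea, not the paper's C4.5BH implementation. The function names, the toy `rows` data, the choice of `k=2`, and the deterministic center initialization are all assumptions made for illustration, and the paper's attribute table structure is not reproduced here.

```python
import math
from collections import Counter, defaultdict
from itertools import chain


def nearest(value, centers):
    """Index of the closest center; used as the discrete bin label."""
    return min(range(len(centers)), key=lambda j: abs(value - centers[j]))


def kmeans_1d(values, k, iters=20):
    """Lloyd's algorithm on one continuous attribute; returns sorted centers."""
    ordered = sorted(values)
    # Deterministic start from k evenly spaced order statistics (an assumption).
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = defaultdict(list)
        for v in values:
            groups[nearest(v, centers)].append(v)
        updated = [sum(groups[j]) / len(groups[j]) if groups[j] else centers[j]
                   for j in range(k)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def map_record(record, label):
    """Map step: emit ((attribute, value, class), 1) for every attribute."""
    for attr, value in record.items():
        yield (attr, value, label), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per key."""
    totals = Counter()
    for key, n in pairs:
        totals[key] += n
    return totals


def entropy(counts):
    """Shannon entropy (base 2) of a collection of non-negative counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts)


def gain_ratios(totals):
    """C4.5 gain ratio per attribute, computed from the reduced counts only."""
    table = defaultdict(lambda: defaultdict(Counter))
    for (attr, value, label), n in totals.items():
        table[attr][value][label] += n
    ratios = {}
    for attr, by_value in table.items():
        class_counts = Counter()
        for per_class in by_value.values():
            class_counts.update(per_class)
        size = sum(class_counts.values())
        conditional = sum(sum(pc.values()) / size * entropy(pc.values())
                          for pc in by_value.values())
        gain = entropy(class_counts.values()) - conditional
        split_info = entropy(sum(pc.values()) for pc in by_value.values())
        ratios[attr] = gain / split_info if split_info > 0 else 0.0
    return ratios


# Hypothetical toy data: one continuous and one categorical attribute.
rows = [
    ({"temp": 30.0, "windy": "no"}, "stay"),
    ({"temp": 31.5, "windy": "yes"}, "stay"),
    ({"temp": 29.0, "windy": "no"}, "stay"),
    ({"temp": 12.0, "windy": "yes"}, "go"),
    ({"temp": 10.5, "windy": "no"}, "go"),
    ({"temp": 11.0, "windy": "yes"}, "go"),
]
centers = kmeans_1d([r["temp"] for r, _ in rows], k=2)
binned = [({**r, "temp": nearest(r["temp"], centers)}, y) for r, y in rows]
totals = reduce_counts(chain.from_iterable(map_record(r, y) for r, y in binned))
ratios = gain_ratios(totals)
best = max(ratios, key=ratios.get)
print(best, ratios)
```

In a real Map/Reduce job, `map_record` would run on separate input splits and `reduce_counts` would run per key on the reducers. Because the gain ratio depends only on the summed counts, every attribute can be scored without gathering the raw records in one place.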