Research and Implementation of a Decision Tree Algorithm Based on Distributed Computing
Abstract This paper abandons the recursion commonly used in decision tree algorithms. Instead, it uses any number of computers on a network in a client/server (C/S) architecture: the server mainly stores the node information of the decision tree, while the clients compute the splitting attribute of each node and its branches in parallel. This is intended to resolve the crashes that arise inherently during the computation. Practice shows that the C/S architecture can be deployed on local area networks and wide area networks, and can make full use of the high-performance virtual networks offered by cloud computing platforms. It is structurally simple, flexible, and widely applicable. On the same sample data set it achieves relatively high test accuracy, and the algorithm is practical and feasible.
Author SHEN Jian-tao (沈建涛) (School of Electronic Information Engineering, Nantong Vocational University, Nantong 226007, China)
Source Journal of Nantong Vocational University (《南通职业大学学报》), 2017, No. 1, pp. 74-77 (4 pages)
Funding 2016 Jiangsu Province Modern Educational Technology Research Project (2016-R-49099)
Keywords decision tree; data mining; distributed computing; big data; cloud computing
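
The non-recursive, server-plus-clients scheme described in the abstract might be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not give the split criterion, the node layout, or the network protocol. Here information gain is assumed as the criterion, a thread pool stands in for the client machines, and the names `best_split`, `build_tree`, `predict`, and the `nodes` table are hypothetical.

```python
from collections import Counter, deque
from concurrent.futures import ThreadPoolExecutor
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())


def best_split(rows, labels, attrs):
    """'Client' task (assumed): choose the attribute with the highest
    information gain. Returns (attribute, {value: row indices}), or None
    when the node should become a leaf."""
    if len(set(labels)) <= 1 or not attrs:
        return None
    base, best = entropy(labels), None
    for attr in attrs:
        groups = {}
        for i, row in enumerate(rows):
            groups.setdefault(row[attr], []).append(i)
        remainder = sum(len(idx) / len(rows) * entropy([labels[i] for i in idx])
                        for idx in groups.values())
        gain = base - remainder
        if best is None or gain > best[0]:
            best = (gain, attr, groups)
    return best[1], best[2]


def build_tree(rows, labels, attrs, workers=4):
    """'Server' loop (assumed): a node table and a work queue replace
    recursion, so tree depth never grows the call stack."""
    nodes = {}  # node id -> finished node record
    pending = deque([(0, rows, labels, tuple(attrs))])
    next_id = 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            batch = list(pending)
            pending.clear()
            # All nodes waiting in the queue are evaluated in parallel.
            splits = pool.map(lambda job: best_split(job[1], job[2], job[3]), batch)
            for (nid, r, y, a), split in zip(batch, splits):
                if split is None:
                    # Leaf: store the majority class.
                    nodes[nid] = {"label": Counter(y).most_common(1)[0][0]}
                    continue
                attr, groups = split
                children = {}
                for value, idx in groups.items():
                    children[value] = next_id
                    pending.append((next_id,
                                    [r[i] for i in idx],
                                    [y[i] for i in idx],
                                    tuple(x for x in a if x != attr)))
                    next_id += 1
                nodes[nid] = {"attr": attr, "children": children}
    return nodes


def predict(nodes, row):
    """Walk the node table iteratively from the root (id 0) to a leaf."""
    node = nodes[0]
    while "label" not in node:
        node = nodes[node["children"][row[node["attr"]]]]
    return node["label"]


if __name__ == "__main__":
    data = [{"outlook": "sunny", "windy": "no"},
            {"outlook": "sunny", "windy": "yes"},
            {"outlook": "rain", "windy": "no"},
            {"outlook": "rain", "windy": "yes"}]
    target = ["play", "play", "stay", "stay"]
    tree = build_tree(data, target, ["outlook", "windy"])
    print([predict(tree, row) for row in data])
```

In a real deployment the thread pool would be replaced by remote clients that receive a node's samples from the server and return the chosen attribute and branches. Python threads do not give true CPU parallelism, so the pool above only models the control flow.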