
A Parallel Decision Tree Classification Algorithm Based on Vertical Data Partitioning

Cited by: 2
Abstract: This paper presents FSPC, a fast, scalable parallel classification algorithm for large databases with many attributes. It is the first algorithm to introduce techniques such as partitioning the dataset vertically and performing the split while the split points (test attributes) are being selected. Experimental results show that these techniques not only reduce communication and I/O costs but also increase the algorithm's degree of parallelism.
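Source: Computer Engineering & Science, CSCD, 2004, No. 7, pp. 67-70 (4 pages)
Funding: Shanghai Science and Technology Development Fund project (01J14022); Shanghai Municipal Education Commission "Fourth-Phase Key Discipline" project (205153)
Keywords: data mining; data warehouse; dataset; parallel classification algorithm; FSPC algorithm; decision tree; database; parallel processing; scalability; classification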
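
The abstract names vertical partitioning as the key idea but gives no implementation details. The following is a minimal sketch of the general technique only, not the FSPC algorithm itself: each worker holds a subset of the attribute columns plus the class labels, finds its best local split, and the global best is chosen from the local winners. All function names, the Gini criterion, the thread pool, and the toy data are assumptions made for illustration; the "split while finding split points" step is not covered.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split_for_attribute(name, values, labels):
    """Best binary split (value <= threshold) on one numeric attribute.

    Returns (weighted_gini, attribute_name, threshold).
    """
    n = len(values)
    best = (float("inf"), name, None)
    # The largest value is skipped: it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[0]:
            best = (score, name, threshold)
    return best

def vertical_partition(columns, n_workers):
    """Assign whole attribute columns to workers in round-robin order."""
    names = sorted(columns)
    return [{name: columns[name] for name in names[i::n_workers]}
            for i in range(n_workers)]

def local_best(partition, labels):
    """Best split among the attributes held by one worker."""
    candidates = [best_split_for_attribute(name, values, labels)
                  for name, values in partition.items()]
    return min(candidates, key=lambda c: c[0],
               default=(float("inf"), None, None))

def global_best_split(columns, labels, n_workers=2):
    """Evaluate each vertical partition concurrently, then pick the winner."""
    partitions = vertical_partition(columns, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda p: local_best(p, labels), partitions))
    return min(results, key=lambda c: c[0])

# Hypothetical toy data: two numeric attributes and a binary class label.
columns = {"age": [25, 32, 47, 51, 62], "income": [30, 80, 40, 90, 35]}
labels = ["no", "yes", "no", "yes", "no"]
score, attribute, threshold = global_best_split(columns, labels)
print(attribute, threshold, score)  # income 40 0.0
```

In this sketch each worker only reports one small tuple, so the amount of data exchanged does not grow with the number of records; this is one plausible reading of why attribute-wise partitioning could reduce communication, though the paper's own mechanism may differ.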

References (5)

  • [1] Jiawei Han, Micheline Kamber. Data Mining: Concepts and Techniques[M]. Morgan Kaufmann Publishers, 2000.
  • [2] John Shafer, Rakesh Agrawal, Manish Mehta. SPRINT: A Scalable Parallel Classifier for Data Mining[A]. Proc of 22nd Int'l Conf on Very Large Databases[C]. 1996.
  • [3] M V Joshi, G Karypis, V Kumar. ScalParC: A New Scalable and Efficient Parallel Classification Algorithm for Mining Large Datasets[A]. Proc of the Int'l Parallel Processing Symp[C]. 1998.
  • [4] Manish Mehta, Rakesh Agrawal, Jorma Rissanen. SLIQ: A Fast Scalable Classifier for Data Mining[A]. Proc of the 5th Int'l Conf on Extending Database Technology (EDBT)[C]. 1996.
  • [5] NASA Ames Research Center. Introduction to IND Version 2.1. GA23-2475-02 ed[Z]. 1992.

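Co-cited literature (17)

  • 1 Wei Hongning. Research on parallel decision tree classification based on the SPRINT method[J]. Journal of Computer Applications, 2005, 25(1): 39-41. Cited by: 18
  • 2 Zhang Yuliang, Zhang Lichen, Li Daiping. Analysis of task granularity and mapping methods for parallel algorithms[J]. Computer Engineering and Applications, 2005, 41(20): 44-47. Cited by: 3
  • 3 Liu Jian, Xie Wei, Zhu Xiaomei, Gu Qiuyan. A new viewpoint and a new method for parallel partitioning of DO-loops[J]. Chinese Journal of Computers, 1996, 19(7): 520-529. Cited by: 1
  • 4 Lin C, Snyder L. Principles of Parallel Programming[M]. Lu Xinda, Lin Xinhua, trans. Beijing: China Machine Press, 2009: 101-195.
  • 5 Lü Shuang, Chen Gaoyun, Wu Xiao, Wang Peng. Research on a parallel decision tree algorithm based on the master-slave model[J]. Journal of Southwest University for Nationalities (Natural Science Edition), 2007, 33(4): 743-745. Cited by: 1
  • 6 Quinlan J R. Induction of decision trees[J]. Machine Learning, 1986, 1(1): 81-106.
  • 7 Kotsiantis S B. Decision trees: a recent overview[J]. Artificial Intelligence Review, 2013, 39(4): 261-283.
  • 8 Sug H. A comprehensively sized decision tree generation method for interactive data mining of very large databases[C]// Advanced Data Mining and Applications. Berlin: Springer, 2005: 141-148.
  • 9 Gill A, Smith G D, Bagnall A J. Improving decision tree performance through induction- and cluster-based stratified sampling[C]// Proc of IDEAL 2004. Berlin: Springer, 2004: 339-344.
  • 10 Fu L. Construction of decision trees using data cube[M]// Enterprise Information Systems VII. Netherlands: Springer, 2006: 87-94.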

Citing literature (2)

Second-level citing literature (5)
