
Research on I/O Analysis and Optimization in the SPRINT Decision Tree Method
Abstract: This paper analyzes the disk I/O of the SPRINT method and proposes three optimizations to reduce its disk I/O time: read optimization, write optimization, and disk seek optimization. Read optimization saves SPRINT one read operation, so that each node of the decision tree requires only a single read. Write optimization allows SPRINT to skip the write at every alternate level of the tree. Disk seek optimization makes the seek cost depend only on the number of tree nodes: only O(n) seeks are required to process n leaf nodes at a level. The three optimizations can be applied separately or in combination.
Source: Computer & Digital Engineering, 2007, No. 6, pp. 49-51, 54 (4 pages)
Keywords: SPRINT, decision tree, disk I/O, optimization
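
For orientation, the passes that the three optimizations target can be illustrated with a simplified, in-memory sketch of the baseline split step in the original SPRINT algorithm (Shafer, Agrawal, and Mehta, VLDB 1996). This is not the optimized method of this paper, whose details the abstract does not give. The function names `best_split` and `split_node`, the `io` counters, and the sample data are assumptions made for illustration; real SPRINT keeps attribute lists on disk and handles categorical attributes as well.

```python
from collections import Counter


def gini(counts):
    """Gini index of a class histogram."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(attr_list):
    """Scan one value-sorted attribute list of (value, label, rid) records.

    Returns (weighted_gini, threshold); threshold is None if no split exists.
    """
    below = Counter()
    # SPRINT inherits this histogram from the parent; it is recounted here
    # only to keep the sketch short.
    above = Counter(label for _, label, _ in attr_list)
    n = len(attr_list)
    best = (float("inf"), None)
    for i in range(n - 1):
        value, label, _ = attr_list[i]
        below[label] += 1
        above[label] -= 1
        next_value = attr_list[i + 1][0]
        if value == next_value:
            continue  # a threshold cannot fall between equal values
        k = i + 1
        score = (k / n) * gini(below) + ((n - k) / n) * gini(above)
        if score < best[0]:
            best = (score, (value + next_value) / 2)
    return best


def split_node(attr_lists):
    """Split one node; attr_lists maps attribute name -> sorted attribute list.

    Assumes at least one attribute admits a split. The io counters count
    passes over attribute lists, standing in for disk reads and writes.
    """
    io = {"reads": 0, "writes": 0}

    # Pass 1: read every attribute list once to evaluate candidate splits.
    scored = {}
    for name, records in attr_lists.items():
        scored[name] = best_split(records)
        io["reads"] += 1
    candidates = [name for name in scored if scored[name][1] is not None]
    winner = min(candidates, key=lambda name: scored[name][0])
    threshold = scored[winner][1]

    # Record ids going left, taken from the winning attribute's list.
    left_rids = {rid for value, _, rid in attr_lists[winner] if value < threshold}

    # Pass 2: read every list again and write out the two child lists.
    left, right = {}, {}
    for name, records in attr_lists.items():
        left[name] = [r for r in records if r[2] in left_rids]
        right[name] = [r for r in records if r[2] not in left_rids]
        io["reads"] += 1
        io["writes"] += 1
    return winner, threshold, left, right, io


# Hypothetical sample data: (value, class label, record id), sorted by value.
age = sorted([(17, "High", 0), (20, "High", 1), (23, "High", 2),
              (32, "Low", 3), (43, "Low", 4), (68, "Low", 5)])
income = sorted([(40, "High", 0), (60, "High", 1), (30, "High", 2),
                 (50, "Low", 3), (20, "Low", 4), (70, "Low", 5)])
winner, threshold, left, right, io = split_node({"age": age, "income": income})
print(winner, threshold)  # age 27.5
print(io)                 # {'reads': 4, 'writes': 2}
```

In this sketch every attribute list is read twice and written once per node. The abstract's read and write optimizations aim at removing one of those reads and, at alternate levels, the write.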


