
An Improved Post-Pruning Algorithm for Decision Tree (一种改进的决策树后剪枝算法)

Cited by: 17
Abstract: Once the depth and node count of a decision tree exceed a certain scale, its classification accuracy on unseen instances gradually declines as the tree grows further. A pruning algorithm is therefore needed to reduce the size of the tree while preserving classification accuracy. Building on an analysis of the strengths and weaknesses of existing decision tree pruning algorithms, the paper proposes an improved post-pruning algorithm that jointly considers classification accuracy, classification stability, and tree size. Experimental results show that the algorithm effectively reduces the size of the decision tree while maintaining the accuracy and stability of the model, making the final automatic classification model more compact.
Authors: 郑伟 (Zheng Wei), 马楠 (Ma Nan)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2015, No. 6, pp. 960-966, 971 (8 pages in total)
Keywords: classification algorithm, decision tree, pruning algorithm
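
The abstract names three criteria for post-pruning (accuracy, stability, and tree size) but does not state how they are combined. The following is only a minimal sketch of one way such a criterion could drive bottom-up pruning, not the authors' algorithm. The `score` function, the weights `alpha` and `beta`, and the use of accuracy spread across validation folds as a proxy for stability are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Optional


@dataclass
class Node:
    label: int                       # majority class reaching this node
    feature: Optional[int] = None    # None marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.feature is None


def predict(node, x):
    # Walk down the tree until a leaf is reached.
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def count_leaves(node):
    if node.is_leaf():
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


def score(root, folds, alpha=0.01, beta=0.5):
    # Hypothetical criterion: reward mean accuracy, penalize the spread of
    # accuracy across folds (instability) and the number of leaves (size).
    accs = [sum(predict(root, x) == y for x, y in fold) / len(fold)
            for fold in folds]
    return mean(accs) - beta * pstdev(accs) - alpha * count_leaves(root)


def prune(root, folds, alpha=0.01, beta=0.5):
    # Bottom-up post-pruning: collapse a subtree into a leaf whenever doing
    # so does not lower the combined score of the whole tree.
    def visit(node):
        if node.is_leaf():
            return
        visit(node.left)
        visit(node.right)
        before = score(root, folds, alpha, beta)
        saved = node.feature
        node.feature = None                  # tentatively collapse
        if score(root, folds, alpha, beta) < before:
            node.feature = saved             # collapse hurt the score: undo
        else:
            node.left = node.right = None    # keep the collapse
    visit(root)
    return root


# Toy tree whose right subtree contains one overfitted split on feature 1.
tree = Node(label=0, feature=0, threshold=0.5,
            left=Node(label=0),
            right=Node(label=1, feature=1, threshold=0.5,
                       left=Node(label=1), right=Node(label=0)))
# Two validation folds in which the class depends only on feature 0.
folds = [
    [((0.2, 0.1), 0), ((0.8, 0.2), 1), ((0.9, 0.9), 1), ((0.1, 0.8), 0)],
    [((0.3, 0.9), 0), ((0.7, 0.7), 1), ((0.6, 0.3), 1), ((0.4, 0.2), 0)],
]
leaves_before = count_leaves(tree)   # 3
pruned = prune(tree, folds)
leaves_after = count_leaves(pruned)  # 2
```

In this toy case the overfitted split on feature 1 is collapsed because removing it raises validation accuracy and shrinks the tree, while the root split is kept because collapsing it would halve the accuracy. With real data the weights `alpha` and `beta` would need tuning, and a different stability measure may suit the problem better.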


