
Improving the C4.5 Data Mining Algorithm
Abstract: This paper presents an optimization of the C4.5 data mining algorithm that applies to the general case. The original C4.5 algorithm relies heavily on logarithm operations when computing the information gain ratio of an attribute, whereas the optimized algorithm needs only addition, subtraction, multiplication, and division, so an implementation does not have to call the logarithm function repeatedly. The optimization preserves the ranking of attributes by information gain ratio and therefore leaves the generated decision tree unchanged. As a result, the improved algorithm reduces time complexity and speeds up decision tree generation without changing accuracy or increasing space complexity.
Author: 谢秋华 (Xie Qiuhua)
Source: Journal of Sanming University (《三明学院学报》), 2013, No. 2, pp. 21-26 (6 pages)
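Funding: Natural Science Foundation of Fujian Province (2012J1283); Research Program for Provincial Universities, Fujian Provincial Department of Education (JK2012051); Key Project of the Sanming Science and Technology Bureau (2011-G-4)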
Keywords: data mining; algorithm; optimization
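
For context on the abstract's claim, the following is a minimal sketch of the standard, logarithm-based gain ratio that C4.5 uses to rank candidate attributes. It is not the paper's optimized formula, which the abstract does not state. The function names `entropy` and `gain_ratio` and the toy data are illustrative assumptions, and only discrete attributes are handled.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    total = len(items)
    # One logarithm call per distinct value: this is the cost the paper targets.
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one discrete attribute: information gain / split information."""
    total = len(labels)
    # Group the class labels by attribute value.
    partitions = {}
    for v, y in zip(values, labels):
        partitions.setdefault(v, []).append(y)
    # Expected entropy of the class labels after splitting on the attribute.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information is the entropy of the attribute's own value distribution.
    split_info = entropy(values)
    # An attribute with a single value carries no split information.
    return gain / split_info if split_info > 0 else 0.0


# Toy data (hypothetical): the first attribute separates the classes perfectly,
# the second tells us nothing about the class.
labels = ["yes", "yes", "no", "no"]
perfect = ["a", "a", "b", "b"]
useless = ["a", "b", "a", "b"]
print(gain_ratio(perfect, labels), gain_ratio(useless, labels))
```

C4.5 picks the attribute with the highest gain ratio at each node, so any rewrite of this computation that preserves the ordering of the ratios yields the same tree, which is the property the abstract asserts for the optimized version.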

