
An Improved Algorithm of Decision Trees by Using the Convex Function
(Original title: 利用上凸函数对决策树算法的改进)
Abstract  This paper studies the computational efficiency of decision tree classification. Based on the structure of the information gain calculation, the concept of a concave ("upper convex") function is introduced to speed up the computation of information gain during tree construction. Using the proposed "consistency theorem" and "special consistency theorem", it is proved that a decision tree built with the improved information gain computation has the same classification accuracy as the tree built by the original ID3 algorithm. Experiments on large datasets show that, for datasets of the same size, the improved algorithm is computationally more efficient than the original one, and that this efficiency advantage tends to grow as the dataset size increases.
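For reference, the quantities the abstract refers to can be written out in standard ID3 notation (the notation below is conventional and assumed, not quoted from the paper). The role of concavity is that each entropy term -p log2 p is a concave ("upper convex") function of the class proportion p:

f(\lambda x + (1-\lambda)y) \;\ge\; \lambda f(x) + (1-\lambda) f(y), \qquad \forall\, x, y \in I,\ \lambda \in [0,1] \quad \text{(concavity of } f \text{ on an interval } I\text{)}

H(S) = -\sum_{i=1}^{k} p_i \log_2 p_i \quad \text{(entropy of a node } S \text{ with class proportions } p_1, \dots, p_k\text{)}

\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v) \quad \text{(information gain of splitting } S \text{ on attribute } A\text{)}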
Source: Chinese Journal of Management Science (《中国管理科学》, CSSCI), 2004, No. 4, pp. 144-148 (5 pages).
Keywords: decision trees; ID3 algorithm; concave ("upper convex") function; information entropy (expected information).
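This page does not reproduce the improved computation or the two consistency theorems, so the sketch below only illustrates the mechanics the abstract describes: ID3-style information gain computed from a pluggable concave node-impurity function, with a log-free concave surrogate (Gini impurity) standing in for the paper's construction. The function names and toy data are illustrative assumptions, not code from the paper, and the surrogate is not guaranteed to select the same splits as entropy.

import math
from collections import Counter

def entropy(labels):
    # Shannon entropy H(S) = -sum_i p_i * log2(p_i); each term -p*log2(p)
    # is a concave ("upper convex") function of p on (0, 1].
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini_impurity(labels):
    # A cheaper concave node impurity, 1 - sum_i p_i**2, with no log calls.
    # Illustrative only: NOT the construction proved equivalent in the paper,
    # and in general it can select a different split attribute than entropy.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_gain(labels, attribute_values, impurity=entropy):
    # Gain(S, A) = I(S) - sum_v (|S_v| / |S|) * I(S_v) for a node impurity I.
    n = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * impurity(g) for g in groups.values())
    return impurity(labels) - remainder

# ID3 picks, at each node, the attribute with the largest gain.
labels  = ["yes", "yes", "no", "no", "yes", "no"]
outlook = ["sunny", "rain", "sunny", "rain", "overcast", "sunny"]
windy   = [True, False, True, True, False, False]
print(split_gain(labels, outlook), split_gain(labels, windy))   # entropy-based gain
print(split_gain(labels, outlook, impurity=gini_impurity))      # concave surrogate

The intended speed-up presumably comes from cheapening the repeated impurity evaluations at every candidate split while, by the paper's consistency theorems, preserving the attribute chosen at each node; the surrogate above only shows where such a substitution would plug in.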