
Comparison of Random Forest and Traditional Classical Method in Regression and Classification Problems
Abstract: Random forest, first proposed by Breiman, is a machine learning algorithm. Using one regression dataset and one classification dataset, this paper applies 10-fold cross-validation to compare the predictive performance of traditional classical regression and classification methods with that of random forest. For the regression data, stepwise regression, ridge regression, partial least squares regression, linear regression and random forest are compared; the 10-fold cross-validation results show that random forest predicts better than the traditional regression methods. For the classification data, mixed linear discriminant analysis, linear discriminant analysis, logistic regression and random forest are compared; the 10-fold cross-validation results show that random forest classifies better than the traditional classification methods.
Author: Dong Yating (董娅婷)
Source: Statistics and Application (《统计学与应用》), 2023, No. 2, pp. 255-260 (6 pages)
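
The evaluation protocol named in the abstract, 10-fold cross-validation, can be sketched roughly as follows. This is a minimal standard-library illustration under stated assumptions, not the paper's code: the helper names (`k_fold_splits`, `cv_mse`), the synthetic data, and the two stand-in models (a mean predictor and a single-feature least-squares line) are all hypothetical. In a real comparison, random forest and the other methods from the abstract would be plugged in as the `fit` argument.

```python
import random

def k_fold_splits(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and yield (train_idx, test_idx) for each of k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds of nearly equal size
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

def cv_mse(fit, x, y, k=10, seed=0):
    """Average held-out mean squared error of a model over k folds."""
    fold_errors = []
    for train, test in k_fold_splits(len(x), k, seed):
        # Fit on the k-1 training folds only, then score on the held-out fold.
        predict = fit([x[i] for i in train], [y[i] for i in train])
        errs = [(predict(x[i]) - y[i]) ** 2 for i in test]
        fold_errors.append(sum(errs) / len(errs))
    return sum(fold_errors) / k

# Stand-in models (assumptions for illustration, not the paper's methods).
def fit_mean(x, y):
    m = sum(y) / len(y)
    return lambda xi: m

def fit_line(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return lambda xi: my + slope * (xi - mx)

# Synthetic data (assumption): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(200)]
y = [2 * xi + 1 + rng.gauss(0, 1) for xi in x]

mse_mean = cv_mse(fit_mean, x, y)
mse_line = cv_mse(fit_line, x, y)
print(f"mean baseline: {mse_mean:.3f}, least-squares line: {mse_line:.3f}")
```

Using the same seed for every model means all candidates are scored on identical folds, so differences in the averaged error reflect the models rather than the split.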


Secondary references (23)

  • 1. Archer KJ, Kimes RV, 2008. Empirical characterization of random forest variable importance measures. Comput. Stat. Data Anal., 52(4): 2249-2260.
  • 2. Biau G, 2012. Analysis of a random forests model. J. Mach. Learn. Res., 13: 1063-1095.
  • 3. Breiman L, 2001a. Random forests. Mach. Learn., 45: 5-32.
  • 4. Breiman L, 2001b. Statistical modeling: The two cultures. Stat. Sci., 16: 199-215.
  • 5. Breiman L, Friedman JH, Olshen RA, Stone CJ, 1984. Classification and Regression Trees. Chapman and Hall. 1-359.
  • 6. Cutler DR, Edwards TC Jr., Beard KH, Cutler A, Hess KT, 2007. Random forests for classification in ecology. Ecology, 88(11): 2783-2792.
  • 7. Deng H, Runger G, Tuv E, 2011. Bias of importance measures for multi-valued attributes and solutions // Proceedings of the 21st International Conference on Artificial Neural Networks (ICANN).
  • 8. Elith J, Graham CH, 2009. Do they? How do they? Why do they differ? On finding reasons for differing performances of species distribution models. Ecography, 32(1): 66-77.
  • 9. Genuer R, Poggi JM, Tuleau-Malot C, 2010. Variable selection using random forests. Pattern Recogn. Lett., 31(14): 2225-2236.
  • 10. Groemping U, 2009. Variable importance assessment in regression: linear regression versus random forest. Am. Stat., 63(4): 308-319.
