Abstract
Random forest, first proposed by Breiman, is a machine learning algorithm. Using one regression data set and one classification data set, this paper applies 10-fold cross-validation to compare the predictive performance of classical regression and classification methods with that of random forest. For the regression data, stepwise regression, ridge regression, partial least squares regression, linear regression, and random forest are compared; the 10-fold cross-validation results show that random forest predicts better than the traditional regression methods. For the classification data, mixed linear discriminant analysis, linear discriminant analysis, logistic regression, and random forest are compared; the 10-fold cross-validation results show that random forest classifies better than the traditional classification methods.
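The 10-fold cross-validation protocol named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's code. It uses only the Python standard library. The data are synthetic, and the two models (`fit_mean`, `fit_line`) are hypothetical stand-ins for the methods compared in the paper. The helper names `k_fold_indices` and `cross_val_mse` are likewise assumptions introduced here.

```python
import random
from statistics import mean


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_mse(fit, X, y, k=10, seed=0):
    """Mean squared error on held-out folds, averaged over k folds.

    `fit(X_train, y_train)` must return a function mapping one x to a prediction.
    """
    scores = []
    for test in k_fold_indices(len(y), k, seed):
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(mean([(predict(X[i]) - y[i]) ** 2 for i in test]))
    return mean(scores)


def fit_mean(X, y):
    """Stand-in baseline: always predict the training mean."""
    m = mean(y)
    return lambda x: m


def fit_line(X, y):
    """Stand-in model: one-variable ordinary least squares."""
    mx, my = mean(X), mean(y)
    sxx = sum((x - mx) ** 2 for x in X)
    slope = sum((x - mx) * (t - my) for x, t in zip(X, y)) / sxx
    return lambda x: my + slope * (x - mx)


# Synthetic data (an assumption for illustration): y = 3x + 2 + noise.
rng = random.Random(1)
X = [rng.uniform(0, 10) for _ in range(200)]
y = [3.0 * x + 2.0 + rng.gauss(0, 1) for x in X]

# Both models are scored on the same folds because the seed is shared.
mse_mean = cross_val_mse(fit_mean, X, y)
mse_line = cross_val_mse(fit_line, X, y)
print(f"10-fold CV MSE, mean baseline: {mse_mean:.3f}")
print(f"10-fold CV MSE, linear fit:    {mse_line:.3f}")
```

Scoring every method on identical folds keeps the comparison fair: each model sees the same training data and is tested on the same held-out points. Any fitting routine that follows the `fit` convention above could be passed to `cross_val_mse` in the same way.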
Source
《统计学与应用》
2023, Issue 2, pp. 255-260 (6 pages)
Statistics and Application