Abstract
This paper uses 2022 second-hand housing price data from Chengdu to build random forest and XGBoost models for predicting second-hand housing prices. The dataset is first cleaned and visualized, and dummy variables are constructed for the categorical features. A correlation heat map is then drawn and the entropy method is applied for feature selection, so that only the important features are used to train the models. Next, grid search is used to tune the random forest and XGBoost models, and their predictive performance is measured with three indicators: the coefficient of determination, the mean squared error, and the mean absolute error. Model comparison and analysis of the results show that the tuned XGBoost model predicts second-hand housing prices well, with a reported accuracy of 90.3%.
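The three evaluation indicators named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `regression_metrics` and the sample prices are hypothetical, and the abstract does not say how the 90.3% figure is derived from these indicators. The sketch only shows the standard definitions of the coefficient of determination, the mean squared error, and the mean absolute error.

```python
from math import fsum


def regression_metrics(y_true, y_pred):
    """Return R^2, MSE and MAE for two equal-length sequences of numbers.

    Hypothetical helper for illustration; standard textbook definitions.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Mean squared error: average of squared residuals.
    ss_res = fsum(r * r for r in residuals)
    mse = ss_res / n

    # Mean absolute error: average of absolute residuals.
    mae = fsum(abs(r) for r in residuals) / n

    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_true = fsum(y_true) / n
    ss_tot = fsum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    r2 = 1 - ss_res / ss_tot

    return {"r2": r2, "mse": mse, "mae": mae}


# Made-up prices (not data from the paper), for illustration only.
actual = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

With these made-up values every prediction is off by 10, so the MAE is 10 and the MSE is 100, while R^2 stays close to 1 because the errors are small relative to the spread of the true prices.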
Source
《应用数学进展》
2024, Issue 9, pp. 4417-4428 (12 pages)
Advances in Applied Mathematics