Abstract
Diabetes has become one of the chronic diseases threatening human health, and early prediction of diabetes can support medical decision-making. Diabetes data sets typically have high dimensionality and many missing values. To improve prediction accuracy under these conditions, this paper takes an ensemble learning approach and proposes a diabetes prediction model based on the XGBoost algorithm. The model uses CART regression trees as base learners, is trained and tested on collected real-world data, and has its main XGBoost parameters tuned, ultimately producing a regression prediction of blood glucose values. Experimental results show that the model's mean absolute percentage error (MAPE) drops to 8.57%, which is more accurate than the SVM-based and random-forest-based prediction models compared in the paper; the model also runs fast and is stable.
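For readers unfamiliar with the metric quoted above, MAPE (mean absolute percentage error) averages the absolute relative error over all samples. The following is a minimal sketch, not the authors' code; the function name `mape` and the sample glucose values are hypothetical and used only for illustration.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes no true value is zero (reasonable for blood glucose readings).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Sum the absolute relative errors, then average and scale to percent.
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical blood glucose values (mmol/L), for illustration only.
actual = [5.0, 6.0, 8.0]
predicted = [5.5, 5.4, 8.0]
print(round(mape(actual, predicted), 2))
```

A lower MAPE means the predicted values are, on average, closer to the measured ones in relative terms, which is how the abstract's 8.57% figure should be read.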
Authors
曲文龙
李一漪
周磊
QU Wen-long; LI Yi-yi; ZHOU Lei (College of Information Engineering, Hebei GEO University, Shijiazhuang 050031, China; College of Materials and Engineering, Southwest Petroleum University, Chengdu 610500, China)
Source
《吉林师范大学学报(自然科学版)》
2019, No. 4, pp. 118-125 (8 pages)
Journal of Jilin Normal University: Natural Science Edition
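Funding
Natural Science Foundation of Hebei Province (F2016403055)
Hebei Province Key Research and Development Program, Special Project for High-Tech Industry Technology Development (18212005)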