
Application of machine learning algorithm in diabetes risk prediction of physical examination population
Cited by: 14
Abstract
Objective: To explore how well a Logistic regression analysis model and the LightGBM (light gradient boosting machine) algorithm predict the future onset of diabetes in a physical examination population, and to examine the influencing factors.
Methods: A total of 36,292 people without diabetes who underwent repeated group physical examinations at the Health Management Center of Nanfang Hospital between August 2003 and April 2019 were included. Stratified random sampling was used to select 70% of the samples as the training set. The independent variables were 34 indicators from the first physical examination, including gender, age, body mass index (BMI), waist circumference, heart rate, systolic blood pressure, diastolic blood pressure, and fasting blood glucose. The dependent variable was whether diabetes developed within 5 years of the first physical examination. Diabetes prediction models were built with the Logistic regression analysis model and the LightGBM algorithm, respectively. The models were then applied to the remaining 30% of the samples, and predictive performance was evaluated with the area under the receiver operating characteristic (ROC) curve (AUC).
Results: The AUC was 0.906 for the Logistic regression analysis model and 0.910 for the LightGBM algorithm model. At the optimal cutoff point, the sensitivity and specificity were 81.5% and 84.3% for the Logistic regression analysis model, and 81.6% and 85.2% for the LightGBM algorithm model.
Conclusion: Both the Logistic regression analysis model and the LightGBM algorithm model predicted the future risk of diabetes well in the physical examination population.
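The evaluation steps named in the abstract (a stratified 70/30 split, AUC, and sensitivity and specificity at an optimal cutoff) can be sketched in plain Python. This is only an illustration under stated assumptions: the abstract does not say how the optimal cutoff was chosen, so Youden's index (sensitivity + specificity - 1) is assumed here; the function names are hypothetical; and the labels and scores are made-up toy values, not study data. Model fitting with Logistic regression or LightGBM is not shown.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, test_idx), sampling train_frac within each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Tied scores count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(y_true, scores):
    """Return (threshold, sensitivity, specificity) maximizing Youden's index.

    Assumption: a case is predicted positive when score >= threshold.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        sens, spec = tp / n_pos, tn / n_neg
        if best is None or sens + spec > best[1] + best[2]:
            best = (t, sens, spec)
    return best


# Toy data (hypothetical): 1 = developed diabetes within 5 years, 0 = did not.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]

train_idx, test_idx = stratified_split(y_true, train_frac=0.7)
auc = roc_auc(y_true, scores)
threshold, sensitivity, specificity = youden_cutoff(y_true, scores)
print(f"AUC={auc:.3f}, cutoff={threshold}, "
      f"sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
```

In a real analysis the scores would be the predicted probabilities of each fitted model on the held-out 30%, and the cutoff search would normally be done with a sorted single pass rather than the quadratic loop used here for clarity.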
Authors: OUYANG Ping; LI Xiao-xi; LENG Fen; LAI Xiao-ying; ZHANG Hui-ming; YAN Chuan-jie; WANG Chu-qiong; BAI Yu; XING Zhi-qiang; LIU Xu-tao; MIAO Miao; DENG Kan; LI Wen-yuan (Department of Health Management Section, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China; Beijing Dudu Yida Technology Co. Ltd, Beijing 100192, China; Hospital Office, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China)
Source: Chinese Journal of Disease Control & Prevention (《中华疾病控制杂志》), indexed in CAS, CSCD, and the Peking University Core list (北大核心), 2021, Issue 7, pp. 849-853, 868 (6 pages)
Keywords: Diabetes mellitus; Physical examination; Logistic regression analysis model; LightGBM model