XGBoost prediction model for osteoarthritis risk based on community big data (cited by: 2)
Abstract Objective To build an osteoarthritis risk warning model from community medical big data using machine learning, providing a quantitative tool for the early warning of osteoarthritis in the community and a more efficient management approach for the prevention and treatment of osteoarthritis in the elderly.

Methods Health records, health examination data, and diagnosis and treatment data from six community health service centres in Shanghai, covering January 1 to December 31, 2019, were integrated into an original database of more than 40,000 samples and 126 variables. After data pre-processing and compound feature selection to choose the model inputs, the XGBoost algorithm was used to construct an osteoarthritis risk assessment model.

Results Fourteen features were selected for the model, including whether the diet balances meat and vegetables, height, weight, body mass index (BMI), duration of each exercise session, total cholesterol, high-density lipoprotein, low-density lipoprotein, hypertension, and limb trauma. The five most important features were high-density lipoprotein, total cholesterol, BMI, low-density lipoprotein, and drinking frequency, each with a feature importance above 0.1. The XGBoost model took "osteoarthritis or not" as the output variable and the 14 selected features as input variables. After training with eight-fold cross-validation, the model achieved on the test set an accuracy of 92%, a precision of 71%, a recall of 65%, an F1 score of 0.68, an area under the receiver operating characteristic curve (AUC) of 0.82, and a KS value of 0.48.

Conclusion The study built an osteoarthritis risk warning model from community medical big data. The model's overall fit and the plausibility of its features are good. It provides a tool for the early warning of osteoarthritis in the community and supports early diagnosis and treatment.
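The evaluation metrics reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the labels and scores below are made-up toy values, the 0.5 decision threshold is an assumption, and "KS" is assumed to mean the Kolmogorov-Smirnov statistic, that is, the largest gap between the true-positive rate and the false-positive rate over all score thresholds. The helper names (`f1_score`, `threshold_metrics`, `roc_points`, `auc_and_ks`) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def threshold_metrics(y_true: Sequence[int], y_score: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 at a fixed decision threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


def roc_points(y_true: Sequence[int],
               y_score: Sequence[float]) -> List[Tuple[float, float]]:
    """(FPR, TPR) pairs as the threshold sweeps from high to low scores."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    pairs = sorted(zip(y_score, y_true), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc_and_ks(y_true: Sequence[int],
               y_score: Sequence[float]) -> Tuple[float, float]:
    """Trapezoidal AUC and KS = max(TPR - FPR) over the ROC points."""
    pts = roc_points(y_true, y_score)
    auc = sum((x1 - x0) * (y0 + y1) / 2
              for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    ks = max(tpr - fpr for fpr, tpr in pts)
    return auc, ks


# Toy data for illustration only; these are NOT the study's data.
Y_TRUE = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
Y_SCORE = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1, 0.05]

if __name__ == "__main__":
    print(threshold_metrics(Y_TRUE, Y_SCORE))
    print(auc_and_ks(Y_TRUE, Y_SCORE))
    # Consistency check on the abstract's reported precision and recall.
    print(round(f1_score(0.71, 0.65), 2))
```

Applied to the abstract's own figures, `f1_score(0.71, 0.65)` rounds to 0.68, so the reported F1 score is consistent with the reported precision and recall. The gap between the 92% accuracy and the 65% recall is what one would expect when negative cases greatly outnumber positive ones, which is why the threshold-free AUC and KS values are useful alongside accuracy.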
Authors LI Li-qiu; XU Cheng-yan; WANG Xiao-li; CAO Yong-qi; LI Yan; ZHAO Liang; WANG Zhao-xin; JIA Huan (Zhuanqiao Community Health Service Center, Minhang District, Shanghai 201108, China; other affiliations not given)
Source Chinese Journal of General Practice (《中华全科医学》), 2022, Issue 12, pp. 2080-2083, 2167 (5 pages in total)
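Funding National Natural Science Foundation of China, General Program (71774116); Shanghai Pudong New Area Health Science and Technology Project, 2018 and 2019 (PW2019A-42); China Hospital Development Institute, Shanghai Jiao Tong University, project (CHDI-2021-B-08); Shanghai "Medical Rising Star" Young Medical Talent Training Funding Program (沪卫人事2020087号); Natural Science Foundation of Minhang District, Shanghai (2020MHZ082).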
Keywords Community; Osteoarthritis; Risk prediction; Big data