Abstract
Objective To construct a risk prediction model for essential hypertension complicated by retinopathy using machine learning algorithms. Methods A total of 402 patients with essential hypertension diagnosed at the physical examination center of the 908th Hospital of the Chinese People’s Liberation Army Joint Logistic Support Force from March 2020 to March 2022 were enrolled, including 201 patients with essential hypertension complicated by retinopathy (observation group) and 201 patients with essential hypertension alone (control group). Thirty-four related indicators were collected from both groups as possible influencing factors for essential hypertension complicated by retinopathy. After variables were screened by univariate analysis, Spearman correlation coefficients, and least absolute shrinkage and selection operator (Lasso) regression, all subjects were randomly divided into a training set and a test set at a ratio of 7:3. In the training set, machine learning algorithms were used to construct support vector machine (SVM), K-nearest neighbor (KNN), classification decision tree (DecisionTree), random forest (RF), extremely randomized trees (ExtraTrees), XGBoost, and LightGBM prediction models, which were then validated in the test set. Accuracy, AUC, sensitivity, and specificity were used to evaluate the models. Results Nineteen variables were selected by univariate analysis, Spearman correlation coefficients, and Lasso regression, and the SVM, KNN, DecisionTree, RF, ExtraTrees, XGBoost, and LightGBM prediction models were constructed. The ExtraTrees model had the best overall performance, with an accuracy of 0.96 and an AUC of 0.997. Conclusion Among the SVM, KNN, DecisionTree, RF, ExtraTrees, XGBoost, and LightGBM prediction models constructed with machine learning algorithms for essential hypertension complicated by retinopathy, the ExtraTrees model had the best predictive performance. It can serve as an auxiliary diagnostic tool in screening for hypertensive retinopathy and may facilitate early screening for hypertensive retinopathy in the future.
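The four evaluation measures named in the abstract (accuracy, AUC, sensitivity, and specificity) can be illustrated with a minimal sketch in standard-library Python. The labels and scores below are a made-up toy example, not data from the study. The function names and the 0.5 decision threshold are assumptions chosen for illustration; the abstract does not state how the authors computed these values.

```python
from __future__ import annotations

from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> dict[str, float]:
    """Compute accuracy, sensitivity, specificity, and AUC for predicted scores."""
    # Assumed rule: a score at or above the threshold is a positive prediction.
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "auc": roc_auc(y_true, scores),
    }


# Toy example (hypothetical values): 1 = retinopathy, 0 = hypertension alone.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
metrics = evaluate(y_true, scores)
print(metrics)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two kinds of measure are usually reported together.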
Authors
秦伟国
淦帆
殷波
徐婷
邓武昌
刘彬
朱良炎
龚攀
许国安
周水莲
QIN Wei-guo; GAN Fan; YIN Bo; XU Ting; DENG Wu-chang; LIU Bin; ZHU Liang-yan; GONG Pan; XU Guo-an; ZHOU Shui-lian (Department of Cardiothoracic Surgery, the 908th Hospital of Chinese People’s Liberation Army Joint Logistic Support Force, Nanchang 330002, China; Department of Ophthalmology, Jiangxi Provincial People’s Hospital, the First Affiliated Hospital of Nanchang Medical College, Nanchang 330006, China; Department of General, Yongxiu County Makou Town Center Health, Jiujiang 330304, China)
Source
《南昌大学学报(医学版)》
2023, Issue 5, pp. 49-54, 80 (7 pages in total)
Journal of Nanchang University: Medical Sciences
Keywords
machine learning algorithm
hypertension
retinopathy
prediction model