Funding: Supported by the National Natural Science Foundation Project (Grant No. 61863027), the Special Research Project on High Quality Development of Innovation and Entrepreneurship Education of the Chinese Society of Higher Education (Grant No. 21CXD01), and the Key R&D Plan of Jiangxi Province (Grant No. 20202BBGL73057).
Abstract: Objective: To establish a stroke prediction and feature analysis model that integrates XGBoost and SHAP to aid the clinical diagnosis and prevention of stroke. Methods: Using an open data set from Kaggle, and after data preprocessing and grid-search parameter optimization, an interpretable stroke risk prediction model was built by integrating XGBoost and SHAP, and an explanatory analysis of risk factors was performed. Results: The XGBoost model's accuracy, sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve (AUC) were 96.71%, 93.83%, 99.59%, and 99.19%, respectively. The explanatory analysis showed that age, type of residence, and history of hypertension were key factors affecting the incidence of stroke. Conclusion: On this data set, the established model can be used to identify stroke, and the SHAP-based explanatory analysis increases the transparency of the model and helps medical practitioners assess its reliability.
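The accuracy, sensitivity, and specificity reported above are all derived from the counts in a binary confusion matrix. The following is a minimal sketch of those definitions in plain Python; the function name `binary_metrics` and the toy labels are illustrative assumptions and are not taken from the paper or its data set.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for 0/1 labels.

    Hypothetical helper for illustration; 1 = stroke, 0 = no stroke.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    return {
        # share of all cases classified correctly
        "accuracy": (tp + tn) / total,
        # share of actual positives that were detected (recall)
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # share of actual negatives that were correctly ruled out
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy labels, invented for this example only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

One observation that follows from these definitions: the reported accuracy (96.71%) equals the mean of the reported sensitivity and specificity, which is what one would expect if the evaluation set contained roughly equal numbers of positive and negative cases. The abstract does not state the class balance, so this is only an inference.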
Abstract: Stroke is a life-threatening disease, usually caused by a blockage of blood flow or insufficient blood flow to the brain. It has a tremendous impact on every aspect of life, since it is the leading global cause of disability and morbidity. Strokes range from minor to severe (extensive), so early assessment and treatment can improve survival rates. Manual prediction is extremely time- and resource-intensive. Automated prediction methods built on modern Information and Communication Technologies (ICTs), particularly Machine Learning (ML), are therefore crucial for the early diagnosis and prognosis of stroke. This research proposes an ensemble voting model based on three ML algorithms: Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LGBM). Data preprocessing is applied to handle outliers and useless instances in the dataset. To address the problem of imbalanced data, the minority class's representation is increased using the Synthetic Minority Over-Sampling Technique (SMOTE), so that it contributes actively to the learning process. Results show that the proposed ensemble voting model outperforms existing studies, state-of-the-art methods, and other traditional classifiers, with an accuracy of 0.96, precision of 0.97, recall of 0.97, and F1-score of 0.96.
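The core idea behind SMOTE, as used above to rebalance the classes, is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below illustrates that idea in plain Python. It is not the authors' pipeline and not the imbalanced-learn implementation; the function name `smote_like_oversample`, the parameter defaults, and the toy points are assumptions made for illustration.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class samples.

    Hypothetical helper: a simplified, SMOTE-style interpolation.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority-class samples with two features, invented for this example.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like_oversample(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by the minority class. In practice, oversampling of this kind is normally applied to the training split only, so that the test set keeps real samples exclusively; the abstract does not say how the authors handled this.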
Abstract: Objective: To explore the relationship between the risk of stroke and a calcaneal quantitative ultrasound (QUS) T-score below -2.5. Methods: A total of 5,847 subjects over the age of 40 from Yunyan District, Guiyang City were investigated with a questionnaire, physical examination, blood lipids, other metabolic indexes and calcaneus bone