Abstract
Heart disease is a common cardiovascular disease that poses a serious threat to human life and health. Accurately predicting whether a person has heart disease supports early detection and early treatment, and can improve patients' quality of life and life expectancy. This study uses the Cleveland heart disease dataset. The raw data are first transformed and standardized, and the processed data are then used as input to train a random forest model. Its predictions are compared with those of several other machine learning models, including logistic regression (LR), K-nearest neighbors (KNN), and decision tree. The results show that the proposed model outperforms the compared models on all five performance metrics: accuracy, precision, recall, F1-score, and AUC. Finally, the SHAP model is introduced to strengthen the interpretability of the prediction model, and feature analysis identifies the main factors affecting heart disease, providing a reference for clinical decision-making.
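As a reading aid for the five metrics named in the abstract, the following is a minimal sketch of how they can be computed for a binary classifier. The function name `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions; they are not taken from the paper and do not reproduce its results.

```python
def binary_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, precision, recall, F1 and AUC for binary labels (1 = disease).

    Hypothetical helper for illustration; the threshold is an assumption.
    """
    # Turn predicted scores into hard 0/1 predictions.
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))

    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # AUC as the fraction of (positive, negative) pairs in which the
    # positive case receives the higher score; ties count as one half.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg)) if pos and neg else float("nan")

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "auc": auc}


# Toy example (made-up labels and scores, not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
metrics = binary_metrics(y_true, y_score)
print(metrics)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, which is one reason to report the two kinds of metric together.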
Authors
程祉元
张博良
蔡雨晨
马雨生
邵泽国
刘巧红
CHENG Zhiyuan; ZHANG Boliang; CAI Yuchen; MA Yusheng; SHAO Zeguo; LIU Qiaohong (School of Medical Instrumentation, Shanghai University of Medicine and Health Sciences, Shanghai 201318, China)
Source
Intelligent Computer and Applications (《智能计算机与应用》)
2023, No. 11, pp. 172-179 (8 pages)
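Funding
National Natural Science Foundation of China (61801288)
Science and Technology Innovation Action Plan project of the Shanghai Municipal Science and Technology Commission (22DZ2305300)
National Social Science Fund of China (20BTQ073)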