Abstract
Heart disease is highly harmful and can be life-threatening. Compared with hospital testing, predicting heart disease with machine learning methods can save considerable time. Using the 1,025 real patient records in the Kaggle heart disease dataset, this paper analyzes the factors related to heart disease and builds four classification models to predict it: K-nearest neighbors, decision tree, random forest, and logistic regression. The models are evaluated with the confusion matrix, accuracy, recall, precision, ROC curve, and AUC value. K-nearest neighbors and random forest give the best predictive performance, providing an effective scientific basis for heart disease prediction and diagnosis.
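The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names and the toy labels and scores below are hypothetical, and no values are taken from the Kaggle dataset. It computes the confusion matrix, accuracy, precision, and recall for a binary classifier, and computes AUC as the fraction of positive-negative pairs in which the positive case receives the higher score.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = disease present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of true positives that the model detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def auc(y_true, scores):
    """AUC via pairwise ranking: a tie between a positive and a negative counts 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
print(accuracy(tp, fp, fn, tn), precision(tp, fp), recall(tp, fn), auc(y_true, scores))
```

On this toy data, accuracy, precision, and recall each come to 0.75 and the AUC to 0.8125. In a comparison of several classifiers, the same functions would be applied to each model's predictions on a held-out test set.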
Source
Advances in Applied Mathematics (《应用数学进展》)
2024, Issue 10, pp. 4610-4622 (13 pages)