Abstract
Cardiovascular disease is a serious threat to human health, characterized by high incidence, high disability rates, and high mortality, which makes research on its prediction particularly important. This paper examines how well the random forest algorithm performs in predicting cardiovascular disease. A cardiovascular disease dataset was downloaded from Kaggle and used to train a random forest model, whose performance was evaluated by accuracy, precision, recall, and F1-score. The model was compared with logistic regression, a K-nearest neighbor classifier, and a support vector machine (SVM). The experimental results show that the random forest algorithm outperforms the other algorithms, with an accuracy of 73.55%, a precision of 75.51%, a recall of 70.11%, and an F1-score of 72.71%. Gini importance evaluation identifies, from among many factors, the important factors affecting cardiovascular disease. These results suggest that the random forest algorithm has a considerable advantage in cardiovascular disease prediction, which is significant both for prediction research and for the timely and effective treatment of patients at an early stage.
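The four evaluation criteria named in the abstract can be illustrated with a minimal sketch. The function names and the toy labels below are hypothetical and are not taken from the paper or its Kaggle dataset; the sketch only shows how the metrics are conventionally defined for a binary classifier, and checks that the reported F1-score agrees with the reported precision and recall.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(y_true, y_pred) -> BinaryMetrics:
    """Compute accuracy, precision, recall and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return BinaryMetrics(accuracy, precision, recall,
                         f1_from_precision_recall(precision, recall))


# Toy labels, invented for illustration only (1 = disease, 0 = no disease).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0]
toy = binary_metrics(toy_true, toy_pred)

# Consistency check on the figures reported in the abstract (in percent).
reported_f1 = f1_from_precision_recall(75.51, 70.11)
print(toy)
print(round(reported_f1, 2))
```

Because F1 is the harmonic mean of precision and recall, the reported precision (75.51) and recall (70.11) imply an F1-score of about 72.71, which matches the value given in the abstract.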
Authors
石胜源
朱磊
叶琳
罗铁清
SHI Shengyuan; ZHU Lei; YE Lin; LUO Tieqing (School of Informatics, Hunan University of Chinese Medicine, Changsha 410208, China)
Source
《智能计算机与应用》
2021, Issue 4, pp. 176-178, 181 (4 pages in total)
Intelligent Computer and Applications
Keywords
Random forest
Cardiovascular disease
Disease prediction