Cardiovascular disease (CVD) is a leading cause of death across the globe. Approximately 17.9 million people die each year from CVD, which accounts for 31% of all deaths. Coronary Artery Disease (CAD) is a common type of CVD and is considered fatal. Predictive models that use machine learning algorithms may assist health workers in the timely detection of CAD, which ultimately reduces mortality. The main purpose of this study is to build a predictive model that provides doctors and health care providers with personalized information so that they can implement better, more personalized treatments for their patients. In this study, we use the publicly available Z-Alizadeh Sani dataset, which contains random samples of 216 cases with CAD and 87 normal controls, with 56 different features. The binary variable "Cath", which represents case-control status, is used as the target variable.
We study its relationship with the other predictors and develop classification models using five supervised machine learning algorithms: Logistic Regression (LR), Classification Tree with Bagging (Bagging CART), Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). These five classification models are used to investigate the detection of CAD. Finally, the performance of the algorithms is compared and the best model is selected. Our results indicate that the SVM model predicts the presence of CAD more effectively and accurately than the other models, with an accuracy of 0.8947, sensitivity of 0.9434, specificity of 0.7826, and AUC of 0.8868.
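As a reminder of how the reported accuracy, sensitivity, and specificity relate to a confusion matrix, here is a minimal Python sketch. The counts used (50 true positives, 3 false negatives, 18 true negatives, 5 false positives) are an assumption: they are one set of counts that reproduces the reported SVM figures to four decimal places, not values stated in the source. The function name `classification_metrics` is likewise illustrative. AUC is omitted because it depends on the ranking of predicted scores and cannot be recovered from these four counts.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        # Share of all test cases classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of true CAD cases that the model flags as CAD.
        "sensitivity": tp / (tp + fn),
        # Share of normal controls that the model labels as normal.
        "specificity": tn / (tn + fp),
    }


# Assumed counts (not given in the source), chosen because they are
# consistent with the reported SVM results.
svm_metrics = classification_metrics(tp=50, fn=3, tn=18, fp=5)

for name, value in svm_metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts, the gap between sensitivity and specificity would mean the model misses few CAD cases but misclassifies a larger share of normal controls, which is worth keeping in mind given the class imbalance (216 cases versus 87 controls).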