Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2021R1A5A1021944 and 2021R1A5A1021944), and by the Kyungpook National University Research Fund, 2020.
Abstract: COVID-19 is a contagious disease, and its several variants have put all walks of life, as well as the economy, under stress. Early diagnosis is crucial to preventing the spread of the virus, which threatens lives worldwide. With advances in technology, the Internet of Things (IoT), and the social IoT (SIoT), the versatile data produced by smart devices have helped considerably in combating this lethal disease. Data mining is a technique for extracting useful information from massive data. In this study, we used five supervised machine learning (ML) algorithms to build a model that analyzes and predicts the presence of COVID-19, using the Kaggle dataset "COVID-19 Symptoms and Presence." RapidMiner Studio ML software was used to apply the Decision Tree (DT), Random Forest (RF), K-Nearest Neighbors (K-NN), Naive Bayes (NB), and Iterative Dichotomiser 3 (ID3) algorithms. The performance of each model was tested using 10-fold cross-validation and compared on the major accuracy measures, Cohen's kappa statistic, correctly or incorrectly classified cases, and root mean square error. The results demonstrate that DT outperforms the other methods, with an accuracy of 98.42% and a root mean square error of 0.11. In the future, the devised model could be recommended to support early prediction and diagnosis of the disease when provided with different data sets.
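The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' RapidMiner Studio workflow. The function names, the toy labels, and the choice to compute root mean square error from the predicted probability of the positive class are all assumptions made for illustration.

```python
import math
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Striding by k gives folds whose sizes differ by at most one.
    return [idx[i::k] for i in range(k)]


def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels, corrected for chance."""
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    p_observed = accuracy(y_true, y_pred)
    # Chance agreement: product of the marginal label frequencies, summed.
    p_chance = sum(
        (y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels
    )
    if p_chance == 1:  # only one label anywhere, kappa is degenerate
        return 1.0
    return (p_observed - p_chance) / (1 - p_chance)


def rmse(y_true, p_positive):
    """Root mean square error between 0/1 labels and predicted probabilities
    of the positive class (an assumed definition, see the lead-in)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, p_positive)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical toy labels, not taken from the Kaggle dataset.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
    print("accuracy:", round(accuracy(y_true, y_pred), 3))
    print("kappa:", round(cohen_kappa(y_true, y_pred), 3))
    print("fold sizes:", [len(f) for f in k_fold_indices(25, k=10)])
```

In a 10-fold run, each fold would serve once as the test set while the other nine train the classifier, and the measures above would be averaged or pooled across folds.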