Funding: The funding for this work was provided by the Research Groups Funding Program, Grant Code (NU/GP/SERC/13/30).
Abstract: Parkinson's disease (PD) is a chronic neurological condition that progresses over time. People start to have trouble speaking, writing, walking, or performing other basic skills as dopamine-generating neurons in some brain regions are injured or die, and symptoms become more severe as the disease progresses. In this study, we applied state-of-the-art machine learning algorithms to diagnose Parkinson's disease and identify related risk factors. The study used a publicly available PD dataset comprising a set of significant PD characteristics. We aim to apply soft computing techniques and provide an effective solution that helps medical professionals diagnose PD accurately. The methodology involves developing a model using a machine learning algorithm. For model selection, eight machine learning techniques were adopted: Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM), Naïve Bayes (NB), Light Gradient Boosting Machine (LightGBM), K-Nearest Neighbours (KNN), Extreme Gradient Boosting (XGBoost), and Logistic Regression (LR). The models were then validated through 10-fold cross-validation and the Receiver Operating Characteristic (ROC) Area Under the Curve (AUC). In addition, GridSearchCV was used to find each algorithm's best parameters, and the models were then trained with the tuned hyperparameters. LightGBM had the highest accuracy in this study at 98%. RF, KNN, and SVM came second with 96% accuracy. The performance scores of NB and LR were 76% and 83%, respectively. Under 10-fold cross-validation, the average performance score of LightGBM was 93%, and its ROC-AUC was 0.92, which indicates that the LightGBM model reached a satisfactory level. Finally, we extracted meaningful insights and identified potential gaps in PD research. The application of advanced machine learning algorithms holds promise for accurately diagnosing PD and shedding light on crucial aspects of the disease. This research has the potential to enhance the understanding and management of PD, ultimately improving the lives of individuals affected by this condition.
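The tuning and validation steps named in this abstract (GridSearchCV, 10-fold cross-validation, ROC-AUC) can be sketched with scikit-learn roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic (`make_classification`), the model is Logistic Regression (one of the eight listed, chosen here only because it trains quickly), and the grid over `C`, the split ratio, and the random seeds are all made up for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import (
    GridSearchCV,
    StratifiedKFold,
    cross_val_score,
    train_test_split,
)

# Synthetic stand-in data (assumption): the real PD dataset is not used here.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# 10-fold stratified splitter, reused for tuning and for the final CV score.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Hypothetical grid: only the regularisation strength C is searched.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid=param_grid,
    scoring="accuracy",
    cv=cv,
)
search.fit(X_train, y_train)
best_model = search.best_estimator_  # refit on the full training split

# Held-out accuracy and ROC-AUC (AUC needs scores, not hard labels).
test_accuracy = best_model.score(X_test, y_test)
test_auc = roc_auc_score(y_test, best_model.predict_proba(X_test)[:, 1])

# Mean accuracy over 10 folds; cross_val_score clones and refits per fold.
cv_accuracy = cross_val_score(best_model, X, y, cv=cv, scoring="accuracy").mean()

print("best params:", search.best_params_)
print(f"test accuracy={test_accuracy:.3f}  test AUC={test_auc:.3f}")
print(f"10-fold mean accuracy={cv_accuracy:.3f}")
```

Swapping in another estimator (for example a gradient-boosting classifier) would only change the estimator object and the parameter grid. The gap between a single-split accuracy and the cross-validated mean, like the 98% versus 93% reported above, is the kind of difference this setup makes visible.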
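Abstract: In data mining, models built by different classifiers differ in performance, and evaluating classifier performance is the basis for selecting a good classifier. To evaluate classifiers more effectively, this paper studies the criteria used to assess classifier performance. It analyses problems that arise when traditional evaluation criteria are applied, focuses on the ROC curve (the Receiver Operating Characteristic curve) and AUC (the area under the ROC curve) as evaluation methods, and examines their advantages and disadvantages. The comparative analysis shows that, although the ROC curve and AUC have certain shortcomings, the attractive properties they exhibit in classifier evaluation mean they are bound to have broad application prospects.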
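To make the two quantities concrete, here is a small from-scratch sketch of how ROC points and the AUC can be computed from classifier scores. It is an illustration only, not code from the paper: the function names `roc_curve_points` and `auc_trapezoid` and the six-sample toy data are assumptions made for the example.

```python
def roc_curve_points(labels, scores):
    """Return ROC points (FPR, TPR), lowering the threshold one score at a time."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("ROC needs at least one positive and one negative")

    # Highest score first: each distinct score acts as a decision threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy data (assumption): 1 = positive class, 0 = negative class.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_curve_points(labels, scores)
area = auc_trapezoid(points)
print(points)
print(round(area, 4))  # 8 of the 9 positive-negative pairs are ranked correctly
```

The curve always starts at (0, 0) and ends at (1, 1). In this toy case the area equals 8/9, the fraction of positive-negative pairs in which the positive sample receives the higher score.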
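Abstract: Accuracy has long been the main criterion for evaluating a classifier's predictive performance, but it has many shortcomings. This paper compares accuracy with AUC (the area under the Receiver Operating Characteristic curve) theoretically, and evaluates three classification learning algorithms on 15 two-class datasets using both AUC and accuracy. Taken together, the theoretical and experimental results show that AUC is not only better than accuracy but should replace it as the measure of classifier performance. In addition, re-evaluating the three classification learning algorithms with AUC further confirms that the Naive Bayes and TAN-CMI algorithms, both based on Bayes' theorem, outperform the decision tree algorithm C4.5.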
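One way to see why accuracy can be a weak measure is a toy case in which two models have identical accuracy but very different AUC. The sketch below is a constructed illustration, not one of the paper's 15 datasets: the labels, both score lists, the 0.5 threshold, and the helper names `accuracy` and `pairwise_auc` are all assumptions.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at a fixed threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def pairwise_auc(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Imbalanced toy data (assumption): 8 negatives, 2 positives.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Model A gives every sample the same score, so it carries no ranking information.
scores_a = [0.1] * 10
# Model B ranks both positives above every negative, but all scores are below 0.5.
scores_b = [0.10, 0.20, 0.20, 0.30, 0.30, 0.40, 0.40, 0.45, 0.46, 0.48]

acc_a, acc_b = accuracy(labels, scores_a), accuracy(labels, scores_b)
auc_a, auc_b = pairwise_auc(labels, scores_a), pairwise_auc(labels, scores_b)
print(acc_a, acc_b)  # both models predict "negative" everywhere
print(auc_a, auc_b)  # only AUC separates the two models
```

Both models reach 0.8 accuracy by predicting the majority class for every sample, yet model B orders the samples perfectly while model A does not order them at all. AUC captures that difference and accuracy at a fixed threshold does not.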