In the era of advanced machine learning techniques, the development of accurate predictive models for complex medical conditions, such as thyroid cancer, has shown remarkable progress. Accurate predictive models for thyroid cancer enhance early detection, improve resource allocation, and reduce overtreatment. However, the widespread adoption of these models in clinical practice demands interpretability and transparency along with predictive performance. This paper proposes a novel association-rule based feature-integrated machine learning model which shows better classification and prediction accuracy than present state-of-the-art models. Our study also focuses on the application of SHapley Additive exPlanations (SHAP) values as a powerful tool for explaining thyroid cancer prediction models. In the proposed method, the association-rule based feature integration framework identifies frequently occurring attribute combinations in the dataset. The original dataset is used to train machine learning models, and SHAP values are then generated from these models. In the next phase, the dataset is integrated with the dominant feature sets identified through association-rule based analysis. This new integrated dataset is used to retrain the machine learning models. The new SHAP values generated from these models help validate the contributions of feature sets in predicting malignancy. Conventional machine learning models lack interpretability, which can hinder their integration into clinical decision-making systems. In this study, SHAP values are introduced along with association-rule based feature integration as a comprehensive framework for understanding the contributions of feature sets in modelling the predictions. The study discusses the importance of reliable predictive models for early diagnosis of thyroid cancer, and a validation framework for explainability. The proposed model shows an accuracy of 93.48%. Performance metrics such as precision, recall, F1-score, and the area under the receiver operating characteristic curve (AUROC) are also higher than those of the baseline models. The results of the proposed model help identify the dominant feature sets that impact thyroid cancer classification and prediction. The features {calcification} and {shape} consistently emerged as the top-ranked features associated with thyroid malignancy, in both the association-rule based interestingness metric values and the SHAP methods. The paper highlights the potential of rule-based integrated models with SHAP in bridging the gap between machine learning predictions and the interpretability of those predictions, which is required for real-world medical applications.
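The rule-mining and feature-integration step described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy records, the `margin` attribute, the attribute values, the thresholds, and the helper names `rule_stats`, `mine_rules`, and `integrate` are hypothetical and are not taken from the paper. Support, confidence, and lift stand in for whatever interestingness metrics the authors used, and the model training and SHAP stages are omitted.

```python
from itertools import combinations

# Toy records (hypothetical values, not the paper's dataset).
records = [
    {"calcification": "micro", "shape": "irregular", "margin": "clear", "malignant": 1},
    {"calcification": "micro", "shape": "irregular", "margin": "blurred", "malignant": 1},
    {"calcification": "micro", "shape": "regular", "margin": "clear", "malignant": 1},
    {"calcification": "none", "shape": "regular", "margin": "clear", "malignant": 0},
    {"calcification": "none", "shape": "irregular", "margin": "blurred", "malignant": 0},
    {"calcification": "none", "shape": "regular", "margin": "blurred", "malignant": 0},
]
ATTRIBUTES = ["calcification", "shape", "margin"]


def rule_stats(records, itemset, target="malignant"):
    """Support, confidence and lift of the rule: itemset -> target == 1."""
    n = len(records)
    matches = [r for r in records if all(r[a] == v for a, v in itemset)]
    both = sum(r[target] for r in matches)       # itemset present AND malignant
    support = both / n
    confidence = both / len(matches) if matches else 0.0
    base_rate = sum(r[target] for r in records) / n
    lift = confidence / base_rate if base_rate else 0.0
    return support, confidence, lift


def mine_rules(records, min_support=0.3, max_len=2):
    """Enumerate attribute-value itemsets up to max_len; keep the frequent ones."""
    rules = []
    for k in range(1, max_len + 1):
        for attrs in combinations(ATTRIBUTES, k):
            # Only itemsets that actually occur in the data are candidates.
            seen = {tuple((a, r[a]) for a in attrs) for r in records}
            for itemset in seen:
                support, confidence, lift = rule_stats(records, itemset)
                if support >= min_support:
                    rules.append((itemset, support, confidence, lift))
    # Rank by lift, then by support, both descending.
    return sorted(rules, key=lambda rule: (rule[3], rule[1]), reverse=True)


def integrate(records, itemset, name):
    """Return copies of the records with a binary column flagging the itemset."""
    return [{**r, name: int(all(r[a] == v for a, v in itemset))} for r in records]


rules = mine_rules(records)
top_itemset = rules[0][0]
augmented = integrate(records, top_itemset, "rule_top")
```

In a fuller pipeline, `augmented` would replace the original table when the models are retrained, and the attribution assigned to the `rule_top` column could then be compared with the rule's interestingness score.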
The sampling of the training data is a bottleneck in the development of artificial intelligence (AI) models, owing either to the processing of huge amounts of data or to the difficulty of accessing the data in industrial practice. Active learning (AL) approaches are useful in such a context since they maximize the performance of the trained model while minimizing the number of training samples. Such smart sampling methodologies iteratively select the points that should be labeled and added to the training set based on their informativeness and pertinence. To judge the relevance of a data instance, query rules are defined. In this paper, we propose an AL methodology based on a physics-based query rule. Given some industrial objectives from the physical process in which the AI model is involved, the physics-based AL approach iteratively converges to the data instances fulfilling those objectives while sampling training points. Therefore, the trained surrogate model is accurate in the regions containing the data instances of potential industrial interest, and coarse everywhere else, where the data instances are of no interest in the industrial context studied.
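The iterative loop this abstract describes (score the unlabeled candidates with a query rule, label the best one, refit the surrogate) can be sketched as follows. This is a generic objective-driven query rule, not the paper's physics-based rule, which the abstract does not specify. The stand-in function `expensive_process`, the piecewise-linear surrogate, the target value, and the novelty term are all assumptions chosen to keep the example self-contained.

```python
import math


def expensive_process(x):
    """Stand-in for a costly simulation or experiment (hypothetical)."""
    return math.sin(3 * x) + 0.5 * x


def surrogate(x, train):
    """Piecewise-linear interpolation through the labeled (x, y) pairs."""
    pts = sorted(train)
    if x <= pts[0][0]:
        return pts[0][1]
    if x >= pts[-1][0]:
        return pts[-1][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            return y0 + (x - x0) / (x1 - x0) * (y1 - y0)


def query_score(x, train, target):
    """Lower is better: predicted distance to the objective, minus a bonus
    for candidates that lie far from every labeled point."""
    mismatch = abs(surrogate(x, train) - target)
    novelty = min(abs(x - xt) for xt, _ in train)
    return mismatch - novelty


def active_learning(pool, target, n_queries):
    """Label the two end points, then query one candidate per iteration."""
    pool = list(pool)
    train = [(x, expensive_process(x)) for x in (pool.pop(0), pool.pop())]
    for _ in range(n_queries):
        best = min(pool, key=lambda x: query_score(x, train, target))
        pool.remove(best)                              # never query a point twice
        train.append((best, expensive_process(best)))  # "label" the chosen point
    return train, pool


candidates = [i * 0.05 for i in range(41)]   # unlabeled pool on [0, 2]
train, remaining = active_learning(candidates, target=1.0, n_queries=8)
```

The `mismatch` term pulls queries toward inputs whose predicted output is close to the objective, and the `novelty` term keeps the loop from resampling one neighbourhood. Together they give a rough analogue of a surrogate that is accurate where it matters and coarse elsewhere.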
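Objective: To systematically evaluate the predictive value of the Acute Ischemic Stroke-Associated Pneumonia Score (AIS-APS) for stroke-associated pneumonia (SAP) in patients with ischemic stroke. Methods: CNKI, the Wanfang database, the VIP database, the China Biology Medicine database (CBM), PubMed, Web of Science, EMbase, the Cochrane Library, and Wiley were searched for studies that used the AIS-APS to predict the risk of SAP in ischemic stroke, from database inception to May 31, 2023. Study quality was assessed with the Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2), and the meta-analysis was performed with Stata 17.0. Results: Fourteen studies involving 7117 patients were included in the meta-analysis. For predicting the risk of SAP in patients with ischemic stroke, the AIS-APS had a pooled sensitivity of 0.82 [95% CI (0.74, 0.88)], a pooled specificity of 0.73 [95% CI (0.66, 0.80)], a pooled positive likelihood ratio of 3.08 [95% CI (2.53, 3.76)], a pooled negative likelihood ratio of 0.25 [95% CI (0.18, 0.34)], a pooled DOR of 2.52 [95% CI (2.20, 2.84)], and a pooled odds ratio of 12.40 [95% CI (9.01, 17.06)]. The area under the curve (AUC) of the summary receiver operating characteristic (SROC) curve for the AIS-APS in predicting SAP was 0.84 [95% CI (0.81, 0.87)]. Deeks' funnel plot showed no publication bias among the included studies (P=0.73), and the Fagan plot indicated good clinical applicability of the score. Conclusion: Current evidence indicates that the AIS-APS has a certain predictive value for the risk of SAP in patients with ischemic stroke. It can be used for preliminary screening of clinical patients to identify those at high risk of SAP, so that further prevention and treatment can be undertaken.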
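The accuracy measures reported in this abstract are linked by standard definitions, which the short sketch below works through. It uses only the point estimates quoted in the abstract; the 20% pre-test probability and the helper name `post_test_probability` are hypothetical. Pooled estimates from a meta-analytic model need not match ratios recomputed from the pooled sensitivity and specificity, so only approximate agreement is expected.

```python
import math

# Point estimates quoted in the abstract.
sensitivity = 0.82
specificity = 0.73
reported_lr_pos = 3.08
reported_lr_neg = 0.25

# Standard definitions of the likelihood ratios and the diagnostic odds ratio.
lr_pos = sensitivity / (1 - specificity)    # about 3.04 (reported: 3.08)
lr_neg = (1 - sensitivity) / specificity    # about 0.25 (reported: 0.25)
dor = lr_pos / lr_neg                       # about 12.3 (reported: 12.40)

# The natural log of 12.40 is about 2.52, the value the abstract labels "DOR".
log_of_reported_or = math.log(12.40)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Bayes update on the odds scale, the calculation a Fagan plot visualizes."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical 20% pre-test probability of SAP (not taken from the abstract).
p_after_positive = post_test_probability(0.20, reported_lr_pos)   # about 0.44
p_after_negative = post_test_probability(0.20, reported_lr_neg)   # about 0.06
```

The figure of 2.52 and its interval (2.20, 2.84) are numerically consistent with the natural logarithm of 12.40 and of its interval (9.01, 17.06). This suggests, although the abstract does not say so, that the first value is a log diagnostic odds ratio and the second is the diagnostic odds ratio itself.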