Funding: Supported by grants from the National Natural Science Foundation of China (81970543 and 81570591), the Zhejiang Provincial Medical & Hygienic Science and Technology Project of China (2018KY385), and the Zhejiang Provincial Natural Science Foundation of China (LY20H160023).
Abstract: Background: Nonalcoholic fatty liver disease (NAFLD) is a public health challenge and a significant cause of morbidity and mortality worldwide. Early identification is crucial for disease intervention. We recently proposed a nomogram-based NAFLD prediction model derived from a large population cohort. Here we aimed to explore machine learning tools for predicting NAFLD. Methods: A retrospective cross-sectional study was performed on 15315 Chinese subjects (10373 in the training set and 4942 in the testing set). Selected clinical and biochemical factors were evaluated with different types of machine learning algorithms to develop and validate seven predictive models. Nine evaluation indicators were applied to compare model performance: area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), accuracy, positive predictive value, sensitivity, F1 score, Matthews correlation coefficient (MCC), specificity, and negative predictive value. The selected clinical and biochemical factors were ranked according to their importance for prediction. Results: In total, 4018/10373 (38.74%) and 1860/4942 (37.64%) subjects had ultrasound-proven NAFLD in the training and testing sets, respectively. Seven machine learning based models were developed, and all demonstrated good performance in predicting NAFLD. Among these models, the XGBoost model had the highest AUROC (0.873), AUPRC (0.810), accuracy (0.795), positive predictive value (0.806), F1 score (0.695), MCC (0.557), and specificity (0.909), demonstrating the best prediction ability among the models built. Body mass index was the most valuable indicator for predicting NAFLD according to the feature ranking scores. Conclusions: The XGBoost model has the best overall prediction ability for diagnosing NAFLD. These novel machine learning tools show considerable potential for NAFLD screening.
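As an illustration of how the threshold-based indicators named in the Methods relate to one another, the following minimal Python sketch computes accuracy, positive and negative predictive value, sensitivity, specificity, F1 score, and MCC from a binary confusion matrix. The function name `threshold_metrics` and the example counts are hypothetical and are not taken from the study; AUROC and AUPRC are omitted because they require the models' continuous prediction scores, which the abstract does not report.

```python
import math


def threshold_metrics(tp, fp, tn, fn):
    """Compute threshold-based metrics from confusion-matrix counts.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives, and false negatives. Illustrative helper only.
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    ppv = ratio(tp, tp + fp)            # positive predictive value (precision)
    sensitivity = ratio(tp, tp + fn)    # recall / true positive rate
    specificity = ratio(tn, tn + fp)    # true negative rate
    npv = ratio(tn, tn + fn)            # negative predictive value
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f1 = ratio(2 * tp, 2 * tp + fp + fn)  # harmonic mean of PPV and sensitivity
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {
        "accuracy": accuracy,
        "ppv": ppv,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "npv": npv,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts chosen for illustration; NOT the study's data.
m = threshold_metrics(tp=60, fp=10, tn=90, fn=40)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With counts like these, a model can combine high specificity and positive predictive value with a more modest sensitivity, which in turn pulls down the F1 score. This is one reason to report MCC alongside accuracy: MCC uses all four cells of the confusion matrix.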