Funding: Novel Coronavirus Infection and Prevention Emergency Scientific Research Special Project of the Chongqing Municipal Education Commission, China, Grant/Award Number: CQEO[2020]no.13; Chongqing Performance Incentive and Guidance Project for Scientific Research Institutions, Grant/Award Number: cstc2020jxjl130016; Chongqing Key Disease Prevention and Control Technology Project, Grant/Award Number: 2019ZX002; Chongqing Technology Innovation and Application Development Project, Grant/Award Number: cstc2019jscxfxydX0008.
Abstract: Background: Most patients with advanced non-small cell lung cancer (NSCLC) have a poor prognosis. Predicting overall survival from clinical data would benefit cancer patients by allowing providers to design an optimal treatment plan. We compared the performance of nomograms with that of machine-learning models in predicting the overall survival of NSCLC patients. This comparison informs the development and selection of models for clinical decision-making in NSCLC. Methods: Multiple machine-learning models were applied to a retrospective cohort of 6586 patients. First, we built and validated a nomogram to predict the overall survival of NSCLC patients. Subsequently, five machine-learning models (logistic regression, random forest, XGBoost, decision tree, and light gradient boosting machine) were used to predict survival status. Next, we evaluated the performance of the models. Finally, the machine-learning model with the highest accuracy was compared with the nomogram at predicting survival status using a novel performance measure: time-dependent prediction accuracy. Results: Among the five machine-learning models, the random forest model achieved the highest accuracy. When it was compared with the nomogram on time-dependent prediction accuracy over follow-up times ranging from 12 to 60 months, the prediction accuracies of both the nomogram and the machine-learning model changed over time. The nomogram reached a maximum prediction accuracy of 0.85 in the 60th month, and the random forest algorithm reached a maximum prediction accuracy of 0.74 in the 13th month. Conclusions: Overall, the nomogram provided more reliable prognostic assessments of NSCLC patients than the machine-learning models over our observation period. Although machine-learning methods have been widely adopted for predicting clinical prognoses in recent studies, the conventional nomogram was competitive. In real clinical applications, a comprehensive model that combines these two methods may demonstrate superior capabilities.
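The abstract does not define time-dependent prediction accuracy, so the following is only a minimal sketch of one plausible reading: at each follow-up horizon, compare the predicted survival status with the observed status, skipping patients whose status at that horizon is unknown because they were censored earlier. The function name `accuracy_at_horizon`, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def accuracy_at_horizon(
    times: Sequence[float],
    events: Sequence[int],
    predicted_dead: Sequence[int],
    horizon: float,
) -> float:
    """Accuracy of predicted survival status at one follow-up horizon.

    times          -- follow-up time per patient, in months
    events         -- 1 if death was observed, 0 if the patient was censored
    predicted_dead -- model prediction (1 = dead by the horizon, 0 = alive)
    horizon        -- follow-up time at which status is evaluated

    Assumption (not from the paper): patients censored at or before the
    horizon have unknown status and are excluded from the calculation.
    """
    correct = 0
    evaluable = 0
    for t, e, p in zip(times, events, predicted_dead):
        if e == 1 and t <= horizon:
            actual = 1  # death observed by the horizon
        elif t > horizon:
            actual = 0  # still under follow-up past the horizon
        else:
            continue  # censored at or before the horizon: status unknown
        evaluable += 1
        correct += int(p == actual)
    if evaluable == 0:
        raise ValueError("no patient has a known status at this horizon")
    return correct / evaluable


if __name__ == "__main__":
    # Hypothetical toy cohort, for illustration only.
    times = [10, 20, 30, 8]
    events = [1, 1, 0, 0]
    predicted = [1, 0, 0, 1]
    for month in (12, 24):
        print(month, accuracy_at_horizon(times, events, predicted, month))
```

In practice a model would produce a separate prediction for each horizon; the toy run reuses one prediction vector only to keep the sketch short.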
Funding: CAMS Innovation Fund for Medical Sciences (CIFMS), Grant/Award Number: 2021-I2M-1-066; Non-profit Central Research Institute Fund of Chinese Academy of Medical Sciences, Grant/Award Number: 2019PT320027; Beijing Hope Run Special Fund of Cancer Foundation of China, Grant/Award Number: LC2019A04; Fundamental Research Funds for the Central Universities, Grant/Award Number: 3332020023.
Abstract: Background: Kidney cancer originates from the urinary tubule epithelial system of the renal parenchyma and accounts for 20% of all urinary system tumors. Approximately 70% of cases are localized at diagnosis, and 30% are metastatic. Most localized kidney cancers can be cured by surgery, but most metastatic patients relapse after surgery and eventually die of kidney cancer. Therefore, accurately predicting patient survival and identifying patients at high risk of metastasis will effectively guide interventions and improve prognosis. Methods: This study used data on 12,394 kidney cancer patients from the Surveillance, Epidemiology, and End Results (SEER) database to construct a research cohort for kidney cancer survival and metastasis. Eight machine learning models (support vector machine, logistic regression, decision tree, random forest, XGBoost, AdaBoost, K-nearest neighbors, and multilayer perceptron) were developed to predict the survival and metastasis of kidney cancer, and six evaluation indicators (accuracy, precision, sensitivity, specificity, F1 score, and area under the receiver operating characteristic curve [AUROC]) were used to verify, evaluate, and optimize the models. Results: Among the eight machine learning models, logistic regression had the highest AUROC in both prediction scenarios. For 3-year survival prediction, the logistic regression model had an accuracy of 0.684, a sensitivity of 0.702, a specificity of 0.670, a precision of 0.686, an F1 score of 0.683, and an AUROC of 0.741. For tumor metastasis prediction, the logistic regression model had an accuracy of 0.800, a sensitivity of 0.540, a specificity of 0.830, a precision of 0.769, an F1 score of 0.772, and an AUROC of 0.804. Conclusion: In this study, we selected variables on the basis of both statistical and clinical significance, and developed and compared eight machine learning models for predicting 3-year survival and metastasis of kidney cancer. The prediction and evaluation results demonstrated that our model could provide decision support for early intervention in kidney cancer patients.
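Five of the six evaluation indicators named in the abstract can be derived from a binary confusion matrix. The sketch below uses the standard binary definitions; the abstract does not say whether its reported values were computed this way or averaged across classes, so this is an illustration rather than the authors' procedure. The function name `binary_metrics` and the counts in the example are hypothetical. AUROC is omitted because it requires predicted scores rather than counts.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts.

    tp / fp -- true / false positives
    tn / fn -- true / false negatives
    Raises ValueError when a denominator would be zero.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fp) == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("counts leave at least one metric undefined")

    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)        # positive predictive value
    sensitivity = tp / (tp + fn)      # recall, true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * tp / (2 * tp + fp + fn)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters when classes are imbalanced, as with metastasis, where a high accuracy can coexist with a low sensitivity.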
Funding: National Key Research & Development Program of China.
Abstract: Cancer informatics has progressed significantly in the big data era. We summarize the application of informatics approaches to the cancer domain from both the informatics perspective (e.g., data management and data science) and the clinical perspective (e.g., cancer screening, risk assessment, diagnosis, treatment, and prognosis). We discuss various informatics methods and tools that are widely applied in cancer research and practice, such as cancer databases, data standards, terminologies, high-throughput omics data mining, machine-learning algorithms, artificial intelligence imaging, and intelligent radiation. We also address the informatics challenges the cancer field faces in pursuing better treatment decisions and patient outcomes, and focus on how informatics can create opportunities for cancer research and practice. Finally, we conclude that the interdisciplinary nature of cancer informatics and collaboration across disciplines are major drivers of future research and clinical applications. We hope that the informatics-specific insights in this review will prove useful to cancer researchers and clinicians.