BACKGROUND Intensive care unit-acquired weakness(ICU-AW)is a common complication that significantly impacts the patient's recovery process,even leading to adverse outcomes.Currently,there is a lack of effective pr...BACKGROUND Intensive care unit-acquired weakness(ICU-AW)is a common complication that significantly impacts the patient's recovery process,even leading to adverse outcomes.Currently,there is a lack of effective preventive measures.AIM To identify significant risk factors for ICU-AW through iterative machine learning techniques and offer recommendations for its prevention and treatment.METHODS Patients were categorized into ICU-AW and non-ICU-AW groups on the 14th day post-ICU admission.Relevant data from the initial 14 d of ICU stay,such as age,comorbidities,sedative dosage,vasopressor dosage,duration of mechanical ventilation,length of ICU stay,and rehabilitation therapy,were gathered.The relationships between these variables and ICU-AW were examined.Utilizing iterative machine learning techniques,a multilayer perceptron neural network model was developed,and its predictive performance for ICU-AW was assessed using the receiver operating characteristic curve.RESULTS Within the ICU-AW group,age,duration of mechanical ventilation,lorazepam dosage,adrenaline dosage,and length of ICU stay were significantly higher than in the non-ICU-AW group.Additionally,sepsis,multiple organ dysfunction syndrome,hypoalbuminemia,acute heart failure,respiratory failure,acute kidney injury,anemia,stress-related gastrointestinal bleeding,shock,hypertension,coronary artery disease,malignant tumors,and rehabilitation therapy ratios were significantly higher in the ICU-AW group,demonstrating statistical significance.The most influential factors contributing to ICU-AW were identified as the length of ICU stay(100.0%)and the duration of mechanical ventilation(54.9%).The neural network model predicted ICU-AW with an area under the curve of 0.941,sensitivity of 92.2%,and specificity of 82.7%.CONCLUSION The main factors influencing ICU-AW are the length of ICU stay and the duration of mechanical ventilation.A primary preventive strategy,when feasible,involves minimizing both ICU stay and mechanical ventilation duration.展开更多
This work constructed a machine learning(ML)model to predict the atmospheric corrosion rate of low-alloy steels(LAS).The material properties of LAS,environmental factors,and exposure time were used as the input,while ...This work constructed a machine learning(ML)model to predict the atmospheric corrosion rate of low-alloy steels(LAS).The material properties of LAS,environmental factors,and exposure time were used as the input,while the corrosion rate as the output.6 dif-ferent ML algorithms were used to construct the proposed model.Through optimization and filtering,the eXtreme gradient boosting(XG-Boost)model exhibited good corrosion rate prediction accuracy.The features of material properties were then transformed into atomic and physical features using the proposed property transformation approach,and the dominant descriptors that affected the corrosion rate were filtered using the recursive feature elimination(RFE)as well as XGBoost methods.The established ML models exhibited better predic-tion performance and generalization ability via property transformation descriptors.In addition,the SHapley additive exPlanations(SHAP)method was applied to analyze the relationship between the descriptors and corrosion rate.The results showed that the property transformation model could effectively help with analyzing the corrosion behavior,thereby significantly improving the generalization ability of corrosion rate prediction models.展开更多
The high throughput prediction of the thermodynamic phase behavior of active pharmaceutical ingredients(APIs)with pharmaceutically relevant excipients remains a major scientific challenge in the screening of pharmaceu...The high throughput prediction of the thermodynamic phase behavior of active pharmaceutical ingredients(APIs)with pharmaceutically relevant excipients remains a major scientific challenge in the screening of pharmaceutical formulations.In this work,a developed machine-learning model efficiently predicts the solubility of APIs in polymers by learning the phase equilibrium principle and using a few molecular descriptors.Under the few-shot learning framework,thermodynamic theory(perturbed-chain statistical associating fluid theory)was used for data augmentation,and computational chemistry was applied for molecular descriptors'screening.The results showed that the developed machine-learning model can predict the API-polymer phase diagram accurately,broaden the solubility data of APIs in polymers,and reproduce the relationship between API solubility and the interaction mechanisms between API and polymer successfully,which provided efficient guidance for the development of pharmaceutical formulations.展开更多
Stroke is a leading cause of disability and mortality worldwide,necessitating the development of advanced technologies to improve its diagnosis,treatment,and patient outcomes.In recent years,machine learning technique...Stroke is a leading cause of disability and mortality worldwide,necessitating the development of advanced technologies to improve its diagnosis,treatment,and patient outcomes.In recent years,machine learning techniques have emerged as promising tools in stroke medicine,enabling efficient analysis of large-scale datasets and facilitating personalized and precision medicine approaches.This abstract provides a comprehensive overview of machine learning’s applications,challenges,and future directions in stroke medicine.Recently introduced machine learning algorithms have been extensively employed in all the fields of stroke medicine.Machine learning models have demonstrated remarkable accuracy in imaging analysis,diagnosing stroke subtypes,risk stratifications,guiding medical treatment,and predicting patient prognosis.Despite the tremendous potential of machine learning in stroke medicine,several challenges must be addressed.These include the need for standardized and interoperable data collection,robust model validation and generalization,and the ethical considerations surrounding privacy and bias.In addition,integrating machine learning models into clinical workflows and establishing regulatory frameworks are critical for ensuring their widespread adoption and impact in routine stroke care.Machine learning promises to revolutionize stroke medicine by enabling precise diagnosis,tailored treatment selection,and improved prognostication.Continued research and collaboration among clinicians,researchers,and technologists are essential for overcoming challenges and realizing the full potential of machine learning in stroke care,ultimately leading to enhanced patient outcomes and quality of life.This review aims to summarize all the current implications of machine learning in stroke diagnosis,treatment,and prognostic evaluation.At the same time,another purpose of this paper is to explore all the future perspectives these techniques can provide in combating this disabling disease.展开更多
Fires,including wildfires,harm air quality and essential public services like transportation,communication,and utilities.These fires can also influence atmospheric conditions,including temperature and aerosols,potenti...Fires,including wildfires,harm air quality and essential public services like transportation,communication,and utilities.These fires can also influence atmospheric conditions,including temperature and aerosols,potentially affecting severe convective storms.Here,we investigate the remote impacts of fires in the western United States(WUS)on the occurrence of large hail(size:≥2.54 cm)in the central US(CUS)over the 20-year period of 2001–20 using the machine learning(ML),Random Forest(RF),and Extreme Gradient Boosting(XGB)methods.The developed RF and XGB models demonstrate high accuracy(>90%)and F1 scores of up to 0.78 in predicting large hail occurrences when WUS fires and CUS hailstorms coincide,particularly in four states(Wyoming,South Dakota,Nebraska,and Kansas).The key contributing variables identified from both ML models include the meteorological variables in the fire region(temperature and moisture),the westerly wind over the plume transport path,and the fire features(i.e.,the maximum fire power and burned area).The results confirm a linkage between WUS fires and severe weather in the CUS,corroborating the findings of our previous modeling study conducted on case simulations with a detailed physics model.展开更多
The martensitic transformation temperature is the basis for the application of shape memory alloys(SMAs),and the ability to quickly and accurately predict the transformation temperature of SMAs has very important prac...The martensitic transformation temperature is the basis for the application of shape memory alloys(SMAs),and the ability to quickly and accurately predict the transformation temperature of SMAs has very important practical significance.In this work,machine learning(ML)methods were utilized to accelerate the search for shape memory alloys with targeted properties(phase transition temperature).A group of component data was selected to design shape memory alloys using reverse design method from numerous unexplored data.Component modeling and feature modeling were used to predict the phase transition temperature of the shape memory alloys.The experimental results of the shape memory alloys were obtained to verify the effectiveness of the support vector regression(SVR)model.The results show that the machine learning model can obtain target materials more efficiently and pertinently,and realize the accurate and rapid design of shape memory alloys with specific target phase transition temperature.On this basis,the relationship between phase transition temperature and material descriptors is analyzed,and it is proved that the key factors affecting the phase transition temperature of shape memory alloys are based on the strength of the bond energy between atoms.This work provides new ideas for the controllable design and performance optimization of Cu-based shape memory alloys.展开更多
Mg alloys possess an inherent plastic anisotropy owing to the selective activation of deformation mechanisms depending on the loading condition.This characteristic results in a diverse range of flow curves that vary w...Mg alloys possess an inherent plastic anisotropy owing to the selective activation of deformation mechanisms depending on the loading condition.This characteristic results in a diverse range of flow curves that vary with a deformation condition.This study proposes a novel approach for accurately predicting an anisotropic deformation behavior of wrought Mg alloys using machine learning(ML)with data augmentation.The developed model combines four key strategies from data science:learning the entire flow curves,generative adversarial networks(GAN),algorithm-driven hyperparameter tuning,and gated recurrent unit(GRU)architecture.The proposed model,namely GAN-aided GRU,was extensively evaluated for various predictive scenarios,such as interpolation,extrapolation,and a limited dataset size.The model exhibited significant predictability and improved generalizability for estimating the anisotropic compressive behavior of ZK60 Mg alloys under 11 annealing conditions and for three loading directions.The GAN-aided GRU results were superior to those of previous ML models and constitutive equations.The superior performance was attributed to hyperparameter optimization,GAN-based data augmentation,and the inherent predictivity of the GRU for extrapolation.As a first attempt to employ ML techniques other than artificial neural networks,this study proposes a novel perspective on predicting the anisotropic deformation behaviors of wrought Mg alloys.展开更多
As some recent information security legislation endowed users with unconditional rights to be forgotten by any trained machine learning model,personalised IoT service pro-viders have to put unlearning functionality in...As some recent information security legislation endowed users with unconditional rights to be forgotten by any trained machine learning model,personalised IoT service pro-viders have to put unlearning functionality into their consideration.The most straight-forward method to unlearn users'contribution is to retrain the model from the initial state,which is not realistic in high throughput applications with frequent unlearning requests.Though some machine unlearning frameworks have been proposed to speed up the retraining process,they fail to match decentralised learning scenarios.A decentralised unlearning framework called heterogeneous decentralised unlearning framework with seed(HDUS)is designed,which uses distilled seed models to construct erasable en-sembles for all clients.Moreover,the framework is compatible with heterogeneous on-device models,representing stronger scalability in real-world applications.Extensive experiments on three real-world datasets show that our HDUS achieves state-of-the-art performance.展开更多
Jet grouting is one of the most popular soil improvement techniques,but its design usually involves great uncertainties that can lead to economic cost overruns in construction projects.The high dispersion in the prope...Jet grouting is one of the most popular soil improvement techniques,but its design usually involves great uncertainties that can lead to economic cost overruns in construction projects.The high dispersion in the properties of the improved material leads to designers assuming a conservative,arbitrary and unjustified strength,which is even sometimes subjected to the results of the test fields.The present paper presents an approach for prediction of the uniaxial compressive strength(UCS)of jet grouting columns based on the analysis of several machine learning algorithms on a database of 854 results mainly collected from different research papers.The selected machine learning model(extremely randomized trees)relates the soil type and various parameters of the technique to the value of the compressive strength.Despite the complex mechanism that surrounds the jet grouting process,evidenced by the high dispersion and low correlation of the variables studied,the trained model allows to optimally predict the values of compressive strength with a significant improvement with respect to the existing works.Consequently,this work proposes for the first time a reliable and easily applicable approach for estimation of the compressive strength of jet grouting columns.展开更多
Cardiovascular disease(CVD)has gradually become one of the main causes of harm to the life and health of residents.Exploring the influencing factors and risk assessment methods of CVD has become a general trend.In thi...Cardiovascular disease(CVD)has gradually become one of the main causes of harm to the life and health of residents.Exploring the influencing factors and risk assessment methods of CVD has become a general trend.In this paper,a machine learning-based decision-making mechanism for risk assessment of CVD is designed.In this mechanism,the logistics regression analysismethod and factor analysismodel are used to select age,obesity degree,blood pressure,blood fat,blood sugar,smoking status,drinking status,and exercise status as the main pathogenic factors of CVD,and an index systemof risk assessment for CVD is established.Then,a two-stage model combining K-means cluster analysis and random forest(RF)is proposed to evaluate and predict the risk of CVD,and the predicted results are compared with the methods of Bayesian discrimination,K-means cluster analysis and RF.The results show that thepredictioneffect of theproposedtwo-stagemodel is better than that of the comparedmethods.Moreover,several suggestions for the government,the medical industry and the public are provided based on the research results.展开更多
The safe and reliable operation of lithium-ion batteries necessitates the accurate prediction of remaining useful life(RUL).However,this task is challenging due to the diverse ageing mechanisms,various operating condi...The safe and reliable operation of lithium-ion batteries necessitates the accurate prediction of remaining useful life(RUL).However,this task is challenging due to the diverse ageing mechanisms,various operating conditions,and limited measured signals.Although data-driven methods are perceived as a promising solution,they ignore intrinsic battery physics,leading to compromised accuracy,low efficiency,and low interpretability.In response,this study integrates domain knowledge into deep learning to enhance the RUL prediction performance.We demonstrate accurate RUL prediction using only a single charging curve.First,a generalisable physics-based model is developed to extract ageing-correlated parameters that can describe and explain battery degradation from battery charging data.The parameters inform a deep neural network(DNN)to predict RUL with high accuracy and efficiency.The trained model is validated under 3 types of batteries working under 7 conditions,considering fully charged and partially charged cases.Using data from one cycle only,the proposed method achieves a root mean squared error(RMSE)of 11.42 cycles and a mean absolute relative error(MARE)of 3.19%on average,which are over45%and 44%lower compared to the two state-of-the-art data-driven methods,respectively.Besides its accuracy,the proposed method also outperforms existing methods in terms of efficiency,input burden,and robustness.The inherent relationship between the model parameters and the battery degradation mechanism is further revealed,substantiating the intrinsic superiority of the proposed method.展开更多
Magnesium(Mg)alloys have shown great prospects as both structural and biomedical materials,while poor corrosion resistance limits their further application.In this work,to avoid the time-consuming and laborious experi...Magnesium(Mg)alloys have shown great prospects as both structural and biomedical materials,while poor corrosion resistance limits their further application.In this work,to avoid the time-consuming and laborious experiment trial,a high-throughput computational strategy based on first-principles calculations is designed for screening corrosion-resistant binary Mg alloy with intermetallics,from both the thermodynamic and kinetic perspectives.The stable binary Mg intermetallics with low equilibrium potential difference with respect to the Mg matrix are firstly identified.Then,the hydrogen adsorption energies on the surfaces of these Mg intermetallics are calculated,and the corrosion exchange current density is further calculated by a hydrogen evolution reaction(HER)kinetic model.Several intermetallics,e.g.Y_(3)Mg,Y_(2)Mg and La_(5)Mg,are identified to be promising intermetallics which might effectively hinder the cathodic HER.Furthermore,machine learning(ML)models are developed to predict Mg intermetallics with proper hydrogen adsorption energy employing work function(W_(f))and weighted first ionization energy(WFIE).The generalization of the ML models is tested on five new binary Mg intermetallics with the average root mean square error(RMSE)of 0.11 eV.This study not only predicts some promising binary Mg intermetallics which may suppress the galvanic corrosion,but also provides a high-throughput screening strategy and ML models for the design of corrosion-resistant alloy,which can be extended to ternary Mg alloys or other alloy systems.展开更多
The advent of pandemics such as COVID-19 significantly impacts human behaviour and lives every day.Therefore,it is essential to make medical services connected to internet,available in every remote location during the...The advent of pandemics such as COVID-19 significantly impacts human behaviour and lives every day.Therefore,it is essential to make medical services connected to internet,available in every remote location during these situations.Also,the security issues in the Internet of Medical Things(IoMT)used in these service,make the situation even more critical because cyberattacks on the medical devices might cause treatment delays or clinical failures.Hence,services in the healthcare ecosystem need rapid,uninterrupted,and secure facilities.The solution provided in this research addresses security concerns and services availability for patients with critical health in remote areas.This research aims to develop an intelligent Software Defined Networks(SDNs)enabled secure framework for IoT healthcare ecosystem.We propose a hybrid of machine learning and deep learning techniques(DNN+SVM)to identify network intrusions in the sensor-based healthcare data.In addition,this system can efficiently monitor connected devices and suspicious behaviours.Finally,we evaluate the performance of our proposed framework using various performance metrics based on the healthcare application scenarios.the experimental results show that the proposed approach effectively detects and mitigates attacks in the SDN-enabled IoT networks and performs better that other state-of-art-approaches.展开更多
The high rate of early recurrence in hepatocellular carcinoma(HCC)post curative surgical intervention poses a substantial clinical hurdle,impacting patient outcomes and complicating postoperative management.The advent...The high rate of early recurrence in hepatocellular carcinoma(HCC)post curative surgical intervention poses a substantial clinical hurdle,impacting patient outcomes and complicating postoperative management.The advent of machine learning provides a unique opportunity to harness vast datasets,identifying subtle patterns and factors that elude conventional prognostic methods.Machine learning models,equipped with the ability to analyse intricate relationships within datasets,have shown promise in predicting outcomes in various medical disciplines.In the context of HCC,the application of machine learning to predict early recurrence holds potential for personalized postoperative care strategies.This editorial comments on the study carried out exploring the merits and efficacy of random survival forests(RSF)in identifying significant risk factors for recurrence,stratifying patients at low and high risk of HCC recurrence and comparing this to traditional COX proportional hazard models(CPH).In doing so,the study demonstrated that the RSF models are superior to traditional CPH models in predicting recurrence of HCC and represent a giant leap towards precision medicine.展开更多
The growing usage of Android smartphones has led to a significant rise in incidents of Android malware andprivacy breaches.This escalating security concern necessitates the development of advanced technologies capable...The growing usage of Android smartphones has led to a significant rise in incidents of Android malware andprivacy breaches.This escalating security concern necessitates the development of advanced technologies capableof automatically detecting andmitigatingmalicious activities in Android applications(apps).Such technologies arecrucial for safeguarding user data and maintaining the integrity of mobile devices in an increasingly digital world.Current methods employed to detect sensitive data leaks in Android apps are hampered by two major limitationsthey require substantial computational resources and are prone to a high frequency of false positives.This meansthat while attempting to identify security breaches,these methods often consume considerable processing powerand mistakenly flag benign activities as malicious,leading to inefficiencies and reduced reliability in malwaredetection.The proposed approach includes a data preprocessing step that removes duplicate samples,managesunbalanced datasets,corrects inconsistencies,and imputes missing values to ensure data accuracy.The Minimaxmethod is then used to normalize numerical data,followed by feature vector extraction using the Gain ratio andChi-squared test to identify and extract the most significant characteristics using an appropriate prediction model.This study focuses on extracting a subset of attributes best suited for the task and recommending a predictivemodel based on domain expert opinion.The proposed method is evaluated using Drebin and TUANDROMDdatasets containing 15,036 and 4,464 benign and malicious samples,respectively.The empirical result shows thatthe RandomForest(RF)and Support VectorMachine(SVC)classifiers achieved impressive accuracy rates of 98.9%and 98.8%,respectively,in detecting unknown Androidmalware.A sensitivity analysis experiment was also carriedout on all three ML-based classifiers based on MAE,MSE,R2,and sensitivity parameters,resulting in a flawlessperformance for both datasets.This approach has substantial potential for real-world applications and can serve asa valuable tool for preventing the spread of Androidmalware and enhancing mobile device security.展开更多
BACKGROUND Liver transplantation(LT)is a life-saving intervention for patients with end-stage liver disease.However,the equitable allocation of scarce donor organs remains a formidable challenge.Prognostic tools are p...BACKGROUND Liver transplantation(LT)is a life-saving intervention for patients with end-stage liver disease.However,the equitable allocation of scarce donor organs remains a formidable challenge.Prognostic tools are pivotal in identifying the most suitable transplant candidates.Traditionally,scoring systems like the model for end-stage liver disease have been instrumental in this process.Nevertheless,the landscape of prognostication is undergoing a transformation with the integration of machine learning(ML)and artificial intelligence models.AIM To assess the utility of ML models in prognostication for LT,comparing their performance and reliability to established traditional scoring systems.METHODS Following the Preferred Reporting Items for Systematic Reviews and Meta-Analysis guidelines,we conducted a thorough and standardized literature search using the PubMed/MEDLINE database.Our search imposed no restrictions on publication year,age,or gender.Exclusion criteria encompassed non-English studies,review articles,case reports,conference papers,studies with missing data,or those exhibiting evident methodological flaws.RESULTS Our search yielded a total of 64 articles,with 23 meeting the inclusion criteria.Among the selected studies,60.8%originated from the United States and China combined.Only one pediatric study met the criteria.Notably,91%of the studies were published within the past five years.ML models consistently demonstrated satisfactory to excellent area under the receiver operating characteristic curve values(ranging from 0.6 to 1)across all studies,surpassing the performance of traditional scoring systems.Random forest exhibited superior predictive capabilities for 90-d mortality following LT,sepsis,and acute kidney injury(AKI).In contrast,gradient boosting excelled in predicting the risk of graft-versus-host disease,pneumonia,and AKI.CONCLUSION This study underscores the potential of ML models in guiding decisions related to allograft allocation and LT,marking a significant evolution in the field of prognostication.展开更多
The success of deep transfer learning in fault diagnosis is attributed to the collection of high-quality labeled data from the source domain.However,in engineering scenarios,achieving such high-quality label annotatio...The success of deep transfer learning in fault diagnosis is attributed to the collection of high-quality labeled data from the source domain.However,in engineering scenarios,achieving such high-quality label annotation is difficult and expensive.The incorrect label annotation produces two negative effects:1)the complex decision boundary of diagnosis models lowers the generalization performance on the target domain,and2)the distribution of target domain samples becomes misaligned with the false-labeled samples.To overcome these negative effects,this article proposes a solution called the label recovery and trajectory designable network(LRTDN).LRTDN consists of three parts.First,a residual network with dual classifiers is to learn features from cross-domain samples.Second,an annotation check module is constructed to generate a label anomaly indicator that could modify the abnormal labels of false-labeled samples in the source domain.With the training of relabeled samples,the complexity of diagnosis model is reduced via semi-supervised learning.Third,the adaptation trajectories are designed for sample distributions across domains.This ensures that the target domain samples are only adapted with the pure-labeled samples.The LRTDN is verified by two case studies,in which the diagnosis knowledge of bearings is transferred across different working conditions as well as different yet related machines.The results show that LRTDN offers a high diagnosis accuracy even in the presence of incorrect annotation.展开更多
Monitoring seismicity in real time provides significant benefits for timely earthquake warning and analyses.In this study,we propose an automatic workflow based on machine learning(ML)to monitor seismicity in the sout...Monitoring seismicity in real time provides significant benefits for timely earthquake warning and analyses.In this study,we propose an automatic workflow based on machine learning(ML)to monitor seismicity in the southern Sichuan Basin of China.This workflow includes coherent event detection,phase picking,and earthquake location using three-component data from a seismic network.By combining Phase Net,we develop an ML-based earthquake location model called Phase Loc,to conduct real-time monitoring of the local seismicity.The approach allows us to use synthetic samples covering the entire study area to train Phase Loc,addressing the problems of insufficient data samples,imbalanced data distribution,and unreliable labels when training with observed data.We apply the trained model to observed data recorded in the southern Sichuan Basin,China,between September 2018 and March 2019.The results show that the average differences in latitude,longitude,and depth are 5.7 km,6.1 km,and 2 km,respectively,compared to the reference catalog.Phase Loc combines all available phase information to make fast and reliable predictions,even if only a few phases are detected and picked.The proposed workflow may help real-time seismic monitoring in other regions as well.展开更多
BACKGROUND The study on predicting the differentiation grade of colorectal cancer(CRC)based on magnetic resonance imaging(MRI)has not been reported yet.Developing a non-invasive model to predict the differentiation gr...BACKGROUND The study on predicting the differentiation grade of colorectal cancer(CRC)based on magnetic resonance imaging(MRI)has not been reported yet.Developing a non-invasive model to predict the differentiation grade of CRC is of great value.AIM To develop and validate machine learning-based models for predicting the differ-entiation grade of CRC based on T2-weighted images(T2WI).METHODS We retrospectively collected the preoperative imaging and clinical data of 315 patients with CRC who underwent surgery from March 2018 to July 2023.Patients were randomly assigned to a training cohort(n=220)or a validation cohort(n=95)at a 7:3 ratio.Lesions were delineated layer by layer on high-resolution T2WI.Least absolute shrinkage and selection operator regression was applied to screen for radiomic features.Radiomics and clinical models were constructed using the multilayer perceptron(MLP)algorithm.These radiomic features and clinically relevant variables(selected based on a significance level of P<0.05 in the training set)were used to construct radiomics-clinical models.The performance of the three models(clinical,radiomic,and radiomic-clinical model)were evaluated using the area under the curve(AUC),calibration curve and decision curve analysis(DCA).RESULTS After feature selection,eight radiomic features were retained from the initial 1781 features to construct the radiomic model.Eight different classifiers,including logistic regression,support vector machine,k-nearest neighbours,random forest,extreme trees,extreme gradient boosting,light gradient boosting machine,and MLP,were used to construct the model,with MLP demonstrating the best diagnostic performance.The AUC of the radiomic-clinical model was 0.862(95%CI:0.796-0.927)in the training cohort and 0.761(95%CI:0.635-0.887)in the validation cohort.The AUC for the radiomic model was 0.796(95%CI:0.723-0.869)in the training cohort and 0.735(95%CI:0.604-0.866)in the validation cohort.The clinical model achieved an AUC of 0.751(95%CI:0.661-0.842)in the training cohort and 0.676(95%CI:0.525-0.827)in the validation cohort.All three models demonstrated good accuracy.In the training cohort,the AUC of the radiomic-clinical model was significantly greater than that of the clinical model(P=0.005)and the radiomic model(P=0.016).DCA confirmed the clinical practicality of incorporating radiomic features into the diagnostic process.CONCLUSION In this study,we successfully developed and validated a T2WI-based machine learning model as an auxiliary tool for the preoperative differentiation between well/moderately and poorly differentiated CRC.This novel approach may assist clinicians in personalizing treatment strategies for patients and improving treatment efficacy.展开更多
基金Supported by Science and Technology Support Program of Qiandongnan Prefecture,No.Qiandongnan Sci-Tech Support[2021]12Guizhou Province High-Level Innovative Talent Training Program,No.Qiannan Thousand Talents[2022]201701.
文摘BACKGROUND Intensive care unit-acquired weakness(ICU-AW)is a common complication that significantly impacts the patient's recovery process,even leading to adverse outcomes.Currently,there is a lack of effective preventive measures.AIM To identify significant risk factors for ICU-AW through iterative machine learning techniques and offer recommendations for its prevention and treatment.METHODS Patients were categorized into ICU-AW and non-ICU-AW groups on the 14th day post-ICU admission.Relevant data from the initial 14 d of ICU stay,such as age,comorbidities,sedative dosage,vasopressor dosage,duration of mechanical ventilation,length of ICU stay,and rehabilitation therapy,were gathered.The relationships between these variables and ICU-AW were examined.Utilizing iterative machine learning techniques,a multilayer perceptron neural network model was developed,and its predictive performance for ICU-AW was assessed using the receiver operating characteristic curve.RESULTS Within the ICU-AW group,age,duration of mechanical ventilation,lorazepam dosage,adrenaline dosage,and length of ICU stay were significantly higher than in the non-ICU-AW group.Additionally,sepsis,multiple organ dysfunction syndrome,hypoalbuminemia,acute heart failure,respiratory failure,acute kidney injury,anemia,stress-related gastrointestinal bleeding,shock,hypertension,coronary artery disease,malignant tumors,and rehabilitation therapy ratios were significantly higher in the ICU-AW group,demonstrating statistical significance.The most influential factors contributing to ICU-AW were identified as the length of ICU stay(100.0%)and the duration of mechanical ventilation(54.9%).The neural network model predicted ICU-AW with an area under the curve of 0.941,sensitivity of 92.2%,and specificity of 82.7%.CONCLUSION The main factors influencing ICU-AW are the length of ICU stay and the duration of mechanical ventilation.A primary preventive strategy,when feasible,involves minimizing both ICU stay and mechanical ventilation duration.
基金the National Key R&D Program of China(No.2021YFB3701705).
文摘This work constructed a machine learning(ML)model to predict the atmospheric corrosion rate of low-alloy steels(LAS).The material properties of LAS,environmental factors,and exposure time were used as the input,while the corrosion rate as the output.6 dif-ferent ML algorithms were used to construct the proposed model.Through optimization and filtering,the eXtreme gradient boosting(XG-Boost)model exhibited good corrosion rate prediction accuracy.The features of material properties were then transformed into atomic and physical features using the proposed property transformation approach,and the dominant descriptors that affected the corrosion rate were filtered using the recursive feature elimination(RFE)as well as XGBoost methods.The established ML models exhibited better predic-tion performance and generalization ability via property transformation descriptors.In addition,the SHapley additive exPlanations(SHAP)method was applied to analyze the relationship between the descriptors and corrosion rate.The results showed that the property transformation model could effectively help with analyzing the corrosion behavior,thereby significantly improving the generalization ability of corrosion rate prediction models.
基金the financial support from the National Natural Science Foundation of China(22278070,21978047,21776046)。
文摘The high throughput prediction of the thermodynamic phase behavior of active pharmaceutical ingredients(APIs)with pharmaceutically relevant excipients remains a major scientific challenge in the screening of pharmaceutical formulations.In this work,a developed machine-learning model efficiently predicts the solubility of APIs in polymers by learning the phase equilibrium principle and using a few molecular descriptors.Under the few-shot learning framework,thermodynamic theory(perturbed-chain statistical associating fluid theory)was used for data augmentation,and computational chemistry was applied for molecular descriptors'screening.The results showed that the developed machine-learning model can predict the API-polymer phase diagram accurately,broaden the solubility data of APIs in polymers,and reproduce the relationship between API solubility and the interaction mechanisms between API and polymer successfully,which provided efficient guidance for the development of pharmaceutical formulations.
文摘Stroke is a leading cause of disability and mortality worldwide,necessitating the development of advanced technologies to improve its diagnosis,treatment,and patient outcomes.In recent years,machine learning techniques have emerged as promising tools in stroke medicine,enabling efficient analysis of large-scale datasets and facilitating personalized and precision medicine approaches.This abstract provides a comprehensive overview of machine learning’s applications,challenges,and future directions in stroke medicine.Recently introduced machine learning algorithms have been extensively employed in all the fields of stroke medicine.Machine learning models have demonstrated remarkable accuracy in imaging analysis,diagnosing stroke subtypes,risk stratifications,guiding medical treatment,and predicting patient prognosis.Despite the tremendous potential of machine learning in stroke medicine,several challenges must be addressed.These include the need for standardized and interoperable data collection,robust model validation and generalization,and the ethical considerations surrounding privacy and bias.In addition,integrating machine learning models into clinical workflows and establishing regulatory frameworks are critical for ensuring their widespread adoption and impact in routine stroke care.Machine learning promises to revolutionize stroke medicine by enabling precise diagnosis,tailored treatment selection,and improved prognostication.Continued research and collaboration among clinicians,researchers,and technologists are essential for overcoming challenges and realizing the full potential of machine learning in stroke care,ultimately leading to enhanced patient outcomes and quality of life.This review aims to summarize all the current implications of machine learning in stroke diagnosis,treatment,and prognostic evaluation.At the same time,another purpose of this paper is to explore all the future perspectives these techniques can provide in combating this disabling disease.
基金supported by the U.S.Department of Energy,Office of Science,Office of Biological and Environmental Research program as part of the Regional and Global Model Analysis and Multi-Sector Dynamics program areas(Award Number DE-SC0016605)Argonne National Laboratory is operated for the DOE by UChicago Argonne,LLC,under contract DE-AC02-06CH11357+1 种基金the National Energy Research Scientific Computing Center(NERSC)NERSC is a U.S.DOE Office of Science User Facility operated under Contract DE-AC02-05CH11231.
文摘Fires,including wildfires,harm air quality and essential public services like transportation,communication,and utilities.These fires can also influence atmospheric conditions,including temperature and aerosols,potentially affecting severe convective storms.Here,we investigate the remote impacts of fires in the western United States(WUS)on the occurrence of large hail(size:≥2.54 cm)in the central US(CUS)over the 20-year period of 2001–20 using the machine learning(ML),Random Forest(RF),and Extreme Gradient Boosting(XGB)methods.The developed RF and XGB models demonstrate high accuracy(>90%)and F1 scores of up to 0.78 in predicting large hail occurrences when WUS fires and CUS hailstorms coincide,particularly in four states(Wyoming,South Dakota,Nebraska,and Kansas).The key contributing variables identified from both ML models include the meteorological variables in the fire region(temperature and moisture),the westerly wind over the plume transport path,and the fire features(i.e.,the maximum fire power and burned area).The results confirm a linkage between WUS fires and severe weather in the CUS,corroborating the findings of our previous modeling study conducted on case simulations with a detailed physics model.
基金financially supported by the National Natural Science Foundation of China(No.51974028)。
文摘The martensitic transformation temperature is the basis for the application of shape memory alloys(SMAs),and the ability to quickly and accurately predict the transformation temperature of SMAs has very important practical significance.In this work,machine learning(ML)methods were utilized to accelerate the search for shape memory alloys with targeted properties(phase transition temperature).A group of component data was selected to design shape memory alloys using reverse design method from numerous unexplored data.Component modeling and feature modeling were used to predict the phase transition temperature of the shape memory alloys.The experimental results of the shape memory alloys were obtained to verify the effectiveness of the support vector regression(SVR)model.The results show that the machine learning model can obtain target materials more efficiently and pertinently,and realize the accurate and rapid design of shape memory alloys with specific target phase transition temperature.On this basis,the relationship between phase transition temperature and material descriptors is analyzed,and it is proved that the key factors affecting the phase transition temperature of shape memory alloys are based on the strength of the bond energy between atoms.This work provides new ideas for the controllable design and performance optimization of Cu-based shape memory alloys.
基金Korea Institute of Energy Technology Evaluation and Planning(KETEP)grant funded by the Korea government(Grant No.20214000000140,Graduate School of Convergence for Clean Energy Integrated Power Generation)Korea Basic Science Institute(National Research Facilities and Equipment Center)grant funded by the Ministry of Education(2021R1A6C101A449)the National Research Foundation of Korea grant funded by the Ministry of Science and ICT(2021R1A2C1095139),Republic of Korea。
文摘Mg alloys possess an inherent plastic anisotropy owing to the selective activation of deformation mechanisms depending on the loading condition.This characteristic results in a diverse range of flow curves that vary with a deformation condition.This study proposes a novel approach for accurately predicting an anisotropic deformation behavior of wrought Mg alloys using machine learning(ML)with data augmentation.The developed model combines four key strategies from data science:learning the entire flow curves,generative adversarial networks(GAN),algorithm-driven hyperparameter tuning,and gated recurrent unit(GRU)architecture.The proposed model,namely GAN-aided GRU,was extensively evaluated for various predictive scenarios,such as interpolation,extrapolation,and a limited dataset size.The model exhibited significant predictability and improved generalizability for estimating the anisotropic compressive behavior of ZK60 Mg alloys under 11 annealing conditions and for three loading directions.The GAN-aided GRU results were superior to those of previous ML models and constitutive equations.The superior performance was attributed to hyperparameter optimization,GAN-based data augmentation,and the inherent predictivity of the GRU for extrapolation.As a first attempt to employ ML techniques other than artificial neural networks,this study proposes a novel perspective on predicting the anisotropic deformation behaviors of wrought Mg alloys.
基金Australian Research Council,Grant/Award Numbers:FT210100624,DP190101985,DE230101033。
文摘As some recent information security legislation endowed users with unconditional rights to be forgotten by any trained machine learning model,personalised IoT service pro-viders have to put unlearning functionality into their consideration.The most straight-forward method to unlearn users'contribution is to retrain the model from the initial state,which is not realistic in high throughput applications with frequent unlearning requests.Though some machine unlearning frameworks have been proposed to speed up the retraining process,they fail to match decentralised learning scenarios.A decentralised unlearning framework called heterogeneous decentralised unlearning framework with seed(HDUS)is designed,which uses distilled seed models to construct erasable en-sembles for all clients.Moreover,the framework is compatible with heterogeneous on-device models,representing stronger scalability in real-world applications.Extensive experiments on three real-world datasets show that our HDUS achieves state-of-the-art performance.
基金This work has been supported by the Conselleria de Inno-vación,Universidades,Ciencia y Sociedad Digital de la Generalitat Valenciana(CIAICO/2021/335).
文摘Jet grouting is one of the most popular soil improvement techniques,but its design usually involves great uncertainties that can lead to economic cost overruns in construction projects.The high dispersion in the properties of the improved material leads to designers assuming a conservative,arbitrary and unjustified strength,which is even sometimes subjected to the results of the test fields.The present paper presents an approach for prediction of the uniaxial compressive strength(UCS)of jet grouting columns based on the analysis of several machine learning algorithms on a database of 854 results mainly collected from different research papers.The selected machine learning model(extremely randomized trees)relates the soil type and various parameters of the technique to the value of the compressive strength.Despite the complex mechanism that surrounds the jet grouting process,evidenced by the high dispersion and low correlation of the variables studied,the trained model allows to optimally predict the values of compressive strength with a significant improvement with respect to the existing works.Consequently,this work proposes for the first time a reliable and easily applicable approach for estimation of the compressive strength of jet grouting columns.
基金This work is supported by the National Natural Science Foundation of China(Nos.72071150,71871174).
文摘Cardiovascular disease(CVD)has gradually become one of the main causes of harm to the life and health of residents.Exploring the influencing factors and risk assessment methods of CVD has become a general trend.In this paper,a machine learning-based decision-making mechanism for risk assessment of CVD is designed.In this mechanism,the logistics regression analysismethod and factor analysismodel are used to select age,obesity degree,blood pressure,blood fat,blood sugar,smoking status,drinking status,and exercise status as the main pathogenic factors of CVD,and an index systemof risk assessment for CVD is established.Then,a two-stage model combining K-means cluster analysis and random forest(RF)is proposed to evaluate and predict the risk of CVD,and the predicted results are compared with the methods of Bayesian discrimination,K-means cluster analysis and RF.The results show that thepredictioneffect of theproposedtwo-stagemodel is better than that of the comparedmethods.Moreover,several suggestions for the government,the medical industry and the public are provided based on the research results.
基金the financial support from the National Natural Science Foundation of China(52207229)the financial support from the China Scholarship Council(202207550010)。
文摘The safe and reliable operation of lithium-ion batteries necessitates the accurate prediction of remaining useful life(RUL).However,this task is challenging due to the diverse ageing mechanisms,various operating conditions,and limited measured signals.Although data-driven methods are perceived as a promising solution,they ignore intrinsic battery physics,leading to compromised accuracy,low efficiency,and low interpretability.In response,this study integrates domain knowledge into deep learning to enhance the RUL prediction performance.We demonstrate accurate RUL prediction using only a single charging curve.First,a generalisable physics-based model is developed to extract ageing-correlated parameters that can describe and explain battery degradation from battery charging data.The parameters inform a deep neural network(DNN)to predict RUL with high accuracy and efficiency.The trained model is validated under 3 types of batteries working under 7 conditions,considering fully charged and partially charged cases.Using data from one cycle only,the proposed method achieves a root mean squared error(RMSE)of 11.42 cycles and a mean absolute relative error(MARE)of 3.19%on average,which are over45%and 44%lower compared to the two state-of-the-art data-driven methods,respectively.Besides its accuracy,the proposed method also outperforms existing methods in terms of efficiency,input burden,and robustness.The inherent relationship between the model parameters and the battery degradation mechanism is further revealed,substantiating the intrinsic superiority of the proposed method.
基金financially supported by the National Key Research and Development Program of China(No.2016YFB0701202,No.2017YFB0701500 and No.2020YFB1505901)National Natural Science Foundation of China(General Program No.51474149,52072240)+3 种基金Shanghai Science and Technology Committee(No.18511109300)Science and Technology Commission of the CMC(2019JCJQZD27300)financial support from the University of Michigan and Shanghai Jiao Tong University joint funding,China(AE604401)Science and Technology Commission of Shanghai Municipality(No.18511109302).
文摘Magnesium(Mg)alloys have shown great prospects as both structural and biomedical materials,while poor corrosion resistance limits their further application.In this work,to avoid the time-consuming and laborious experiment trial,a high-throughput computational strategy based on first-principles calculations is designed for screening corrosion-resistant binary Mg alloy with intermetallics,from both the thermodynamic and kinetic perspectives.The stable binary Mg intermetallics with low equilibrium potential difference with respect to the Mg matrix are firstly identified.Then,the hydrogen adsorption energies on the surfaces of these Mg intermetallics are calculated,and the corrosion exchange current density is further calculated by a hydrogen evolution reaction(HER)kinetic model.Several intermetallics,e.g.Y_(3)Mg,Y_(2)Mg and La_(5)Mg,are identified to be promising intermetallics which might effectively hinder the cathodic HER.Furthermore,machine learning(ML)models are developed to predict Mg intermetallics with proper hydrogen adsorption energy employing work function(W_(f))and weighted first ionization energy(WFIE).The generalization of the ML models is tested on five new binary Mg intermetallics with the average root mean square error(RMSE)of 0.11 eV.This study not only predicts some promising binary Mg intermetallics which may suppress the galvanic corrosion,but also provides a high-throughput screening strategy and ML models for the design of corrosion-resistant alloy,which can be extended to ternary Mg alloys or other alloy systems.
文摘The advent of pandemics such as COVID-19 significantly impacts human behaviour and lives every day.Therefore,it is essential to make medical services connected to internet,available in every remote location during these situations.Also,the security issues in the Internet of Medical Things(IoMT)used in these service,make the situation even more critical because cyberattacks on the medical devices might cause treatment delays or clinical failures.Hence,services in the healthcare ecosystem need rapid,uninterrupted,and secure facilities.The solution provided in this research addresses security concerns and services availability for patients with critical health in remote areas.This research aims to develop an intelligent Software Defined Networks(SDNs)enabled secure framework for IoT healthcare ecosystem.We propose a hybrid of machine learning and deep learning techniques(DNN+SVM)to identify network intrusions in the sensor-based healthcare data.In addition,this system can efficiently monitor connected devices and suspicious behaviours.Finally,we evaluate the performance of our proposed framework using various performance metrics based on the healthcare application scenarios.the experimental results show that the proposed approach effectively detects and mitigates attacks in the SDN-enabled IoT networks and performs better that other state-of-art-approaches.
文摘The high rate of early recurrence in hepatocellular carcinoma(HCC)post curative surgical intervention poses a substantial clinical hurdle,impacting patient outcomes and complicating postoperative management.The advent of machine learning provides a unique opportunity to harness vast datasets,identifying subtle patterns and factors that elude conventional prognostic methods.Machine learning models,equipped with the ability to analyse intricate relationships within datasets,have shown promise in predicting outcomes in various medical disciplines.In the context of HCC,the application of machine learning to predict early recurrence holds potential for personalized postoperative care strategies.This editorial comments on the study carried out exploring the merits and efficacy of random survival forests(RSF)in identifying significant risk factors for recurrence,stratifying patients at low and high risk of HCC recurrence and comparing this to traditional COX proportional hazard models(CPH).In doing so,the study demonstrated that the RSF models are superior to traditional CPH models in predicting recurrence of HCC and represent a giant leap towards precision medicine.
基金Princess Nourah bint Abdulrahman University and Researchers Supporting Project Number(PNURSP2024R346)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘The growing usage of Android smartphones has led to a significant rise in incidents of Android malware andprivacy breaches.This escalating security concern necessitates the development of advanced technologies capableof automatically detecting andmitigatingmalicious activities in Android applications(apps).Such technologies arecrucial for safeguarding user data and maintaining the integrity of mobile devices in an increasingly digital world.Current methods employed to detect sensitive data leaks in Android apps are hampered by two major limitationsthey require substantial computational resources and are prone to a high frequency of false positives.This meansthat while attempting to identify security breaches,these methods often consume considerable processing powerand mistakenly flag benign activities as malicious,leading to inefficiencies and reduced reliability in malwaredetection.The proposed approach includes a data preprocessing step that removes duplicate samples,managesunbalanced datasets,corrects inconsistencies,and imputes missing values to ensure data accuracy.The Minimaxmethod is then used to normalize numerical data,followed by feature vector extraction using the Gain ratio andChi-squared test to identify and extract the most significant characteristics using an appropriate prediction model.This study focuses on extracting a subset of attributes best suited for the task and recommending a predictivemodel based on domain expert opinion.The proposed method is evaluated using Drebin and TUANDROMDdatasets containing 15,036 and 4,464 benign and malicious samples,respectively.The empirical result shows thatthe RandomForest(RF)and Support VectorMachine(SVC)classifiers achieved impressive accuracy rates of 98.9%and 98.8%,respectively,in detecting unknown Androidmalware.A sensitivity analysis experiment was also carriedout on all three ML-based classifiers based on MAE,MSE,R2,and sensitivity parameters,resulting in a flawlessperformance for both datasets.This approach has substantial potential for real-world applications and can serve asa valuable tool for preventing the spread of Androidmalware and enhancing mobile device security.
文摘BACKGROUND Liver transplantation(LT)is a life-saving intervention for patients with end-stage liver disease.However,the equitable allocation of scarce donor organs remains a formidable challenge.Prognostic tools are pivotal in identifying the most suitable transplant candidates.Traditionally,scoring systems like the model for end-stage liver disease have been instrumental in this process.Nevertheless,the landscape of prognostication is undergoing a transformation with the integration of machine learning(ML)and artificial intelligence models.AIM To assess the utility of ML models in prognostication for LT,comparing their performance and reliability to established traditional scoring systems.METHODS Following the Preferred Reporting Items for Systematic Reviews and Meta-Analysis guidelines,we conducted a thorough and standardized literature search using the PubMed/MEDLINE database.Our search imposed no restrictions on publication year,age,or gender.Exclusion criteria encompassed non-English studies,review articles,case reports,conference papers,studies with missing data,or those exhibiting evident methodological flaws.RESULTS Our search yielded a total of 64 articles,with 23 meeting the inclusion criteria.Among the selected studies,60.8%originated from the United States and China combined.Only one pediatric study met the criteria.Notably,91%of the studies were published within the past five years.ML models consistently demonstrated satisfactory to excellent area under the receiver operating characteristic curve values(ranging from 0.6 to 1)across all studies,surpassing the performance of traditional scoring systems.Random forest exhibited superior predictive capabilities for 90-d mortality following LT,sepsis,and acute kidney injury(AKI).In contrast,gradient boosting excelled in predicting the risk of graft-versus-host disease,pneumonia,and AKI.CONCLUSION This study underscores the potential of ML models in guiding decisions related to allograft allocation and LT,marking a significant evolution in the field of prognostication.
基金the National Key R&D Program of China(2022YFB3402100)the National Science Fund for Distinguished Young Scholars of China(52025056)+4 种基金the National Natural Science Foundation of China(52305129)the China Postdoctoral Science Foundation(2023M732789)the China Postdoctoral Innovative Talents Support Program(BX20230290)the Open Foundation of Hunan Provincial Key Laboratory of Health Maintenance for Mechanical Equipment(2022JXKF JJ01)the Fundamental Research Funds for Central Universities。
文摘The success of deep transfer learning in fault diagnosis is attributed to the collection of high-quality labeled data from the source domain.However,in engineering scenarios,achieving such high-quality label annotation is difficult and expensive.The incorrect label annotation produces two negative effects:1)the complex decision boundary of diagnosis models lowers the generalization performance on the target domain,and2)the distribution of target domain samples becomes misaligned with the false-labeled samples.To overcome these negative effects,this article proposes a solution called the label recovery and trajectory designable network(LRTDN).LRTDN consists of three parts.First,a residual network with dual classifiers is to learn features from cross-domain samples.Second,an annotation check module is constructed to generate a label anomaly indicator that could modify the abnormal labels of false-labeled samples in the source domain.With the training of relabeled samples,the complexity of diagnosis model is reduced via semi-supervised learning.Third,the adaptation trajectories are designed for sample distributions across domains.This ensures that the target domain samples are only adapted with the pure-labeled samples.The LRTDN is verified by two case studies,in which the diagnosis knowledge of bearings is transferred across different working conditions as well as different yet related machines.The results show that LRTDN offers a high diagnosis accuracy even in the presence of incorrect annotation.
基金the financial support of the National Key R&D Program of China(2021YFC3000701)the China Seismic Experimental Site in Sichuan-Yunnan(CSES-SY)。
文摘Monitoring seismicity in real time provides significant benefits for timely earthquake warning and analyses.In this study,we propose an automatic workflow based on machine learning(ML)to monitor seismicity in the southern Sichuan Basin of China.This workflow includes coherent event detection,phase picking,and earthquake location using three-component data from a seismic network.By combining Phase Net,we develop an ML-based earthquake location model called Phase Loc,to conduct real-time monitoring of the local seismicity.The approach allows us to use synthetic samples covering the entire study area to train Phase Loc,addressing the problems of insufficient data samples,imbalanced data distribution,and unreliable labels when training with observed data.We apply the trained model to observed data recorded in the southern Sichuan Basin,China,between September 2018 and March 2019.The results show that the average differences in latitude,longitude,and depth are 5.7 km,6.1 km,and 2 km,respectively,compared to the reference catalog.Phase Loc combines all available phase information to make fast and reliable predictions,even if only a few phases are detected and picked.The proposed workflow may help real-time seismic monitoring in other regions as well.
基金the Fujian Province Clinical Key Specialty Construction Project,No.2022884Quanzhou Science and Technology Plan Project,No.2021N034S+1 种基金The Youth Research Project of Fujian Provincial Health Commission,No.2022QNA067Malignant Tumor Clinical Medicine Research Center,No.2020N090s.
文摘BACKGROUND The study on predicting the differentiation grade of colorectal cancer(CRC)based on magnetic resonance imaging(MRI)has not been reported yet.Developing a non-invasive model to predict the differentiation grade of CRC is of great value.AIM To develop and validate machine learning-based models for predicting the differ-entiation grade of CRC based on T2-weighted images(T2WI).METHODS We retrospectively collected the preoperative imaging and clinical data of 315 patients with CRC who underwent surgery from March 2018 to July 2023.Patients were randomly assigned to a training cohort(n=220)or a validation cohort(n=95)at a 7:3 ratio.Lesions were delineated layer by layer on high-resolution T2WI.Least absolute shrinkage and selection operator regression was applied to screen for radiomic features.Radiomics and clinical models were constructed using the multilayer perceptron(MLP)algorithm.These radiomic features and clinically relevant variables(selected based on a significance level of P<0.05 in the training set)were used to construct radiomics-clinical models.The performance of the three models(clinical,radiomic,and radiomic-clinical model)were evaluated using the area under the curve(AUC),calibration curve and decision curve analysis(DCA).RESULTS After feature selection,eight radiomic features were retained from the initial 1781 features to construct the radiomic model.Eight different classifiers,including logistic regression,support vector machine,k-nearest neighbours,random forest,extreme trees,extreme gradient boosting,light gradient boosting machine,and MLP,were used to construct the model,with MLP demonstrating the best diagnostic performance.The AUC of the radiomic-clinical model was 0.862(95%CI:0.796-0.927)in the training cohort and 0.761(95%CI:0.635-0.887)in the validation cohort.The AUC for the radiomic model was 0.796(95%CI:0.723-0.869)in the training cohort and 0.735(95%CI:0.604-0.866)in the validation cohort.The clinical model achieved an AUC of 0.751(95%CI:0.661-0.842)in the training cohort and 0.676(95%CI:0.525-0.827)in the validation cohort.All three models demonstrated good accuracy.In the training cohort,the AUC of the radiomic-clinical model was significantly greater than that of the clinical model(P=0.005)and the radiomic model(P=0.016).DCA confirmed the clinical practicality of incorporating radiomic features into the diagnostic process.CONCLUSION In this study,we successfully developed and validated a T2WI-based machine learning model as an auxiliary tool for the preoperative differentiation between well/moderately and poorly differentiated CRC.This novel approach may assist clinicians in personalizing treatment strategies for patients and improving treatment efficacy.