The rapid growth of machine learning (ML) across fields has intensified the challenge of selecting the right algorithm for a specific task, known as the Algorithm Selection Problem (ASP). Traditional trial-and-error methods have become impractical due to their resource demands. Automated Machine Learning (AutoML) systems automate this process, but often neglect the group structures and sparsity in meta-features, leading to inefficiencies in algorithm recommendations for classification tasks. This paper proposes a meta-learning approach using Multivariate Sparse Group Lasso (MSGL) to address these limitations. Our method models both within-group and across-group sparsity among meta-features to manage high-dimensional data and reduce multicollinearity across eight meta-feature groups. The Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) with adaptive restart efficiently solves the non-smooth optimization problem. Empirical validation on 145 classification datasets with 17 classification algorithms shows that our meta-learning method outperforms four state-of-the-art approaches, achieving 77.18% classification accuracy, 86.07% recommendation accuracy and 88.83% normalized discounted cumulative gain.
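The combination of a sparse-group penalty with FISTA described in the abstract above can be illustrated with a minimal sketch. This is not the paper's implementation: it handles a single coefficient vector rather than the multivariate (multi-response) case, the restart rule is a common gradient-based heuristic that the abstract does not specify, and all names (`prox_sparse_group_lasso`, `fista_with_restart`, `lam_l1`, `lam_group`) are assumptions chosen for illustration.

```python
import math


def soft_threshold(x, t):
    """Scalar soft-thresholding: the proximal operator of t * |x|."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def prox_sparse_group_lasso(v, groups, lam_l1, lam_group, step):
    """Proximal operator of step * (lam_l1 * ||v||_1 + lam_group * sum_g ||v_g||_2).

    `groups` is a list of index lists that partition range(len(v)).
    """
    out = [0.0] * len(v)
    for idx in groups:
        # Within-group sparsity: shrink each coefficient individually.
        shrunk = [soft_threshold(v[i], step * lam_l1) for i in idx]
        # Across-group sparsity: shrink (or zero out) the whole group.
        norm = math.sqrt(sum(s * s for s in shrunk))
        scale = max(0.0, 1.0 - step * lam_group / norm) if norm > 0.0 else 0.0
        for i, s in zip(idx, shrunk):
            out[i] = s * scale
    return out


def fista_with_restart(grad, prox, w0, step, n_iter=200):
    """FISTA for a smooth loss plus a non-smooth penalty, with a gradient-based restart."""
    w, z, t = list(w0), list(w0), 1.0
    for _ in range(n_iter):
        g = grad(z)
        w_new = prox([zi - step * gi for zi, gi in zip(z, g)], step)
        # Restart: drop the momentum when it points against the latest step.
        if sum((zi - wn) * (wn - wo) for zi, wn, wo in zip(z, w_new, w)) > 0.0:
            t = 1.0
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_new
        z = [wn + beta * (wn - wo) for wn, wo in zip(w_new, w)]
        w, t = w_new, t_new
    return w


# Toy problem (illustrative data): minimise 0.5 * ||w - b||^2 plus the penalty.
b = [3.0, 4.0, 0.1, -0.2]
groups = [[0, 1], [2, 3]]
solution = fista_with_restart(
    grad=lambda w: [wi - bi for wi, bi in zip(w, b)],
    prox=lambda v, s: prox_sparse_group_lasso(v, groups, 0.5, 1.0, s),
    w0=[0.0] * 4,
    step=1.0,
)
```

In this toy problem the second group is removed entirely while the first group is only shrunk, which is the two-level sparsity the penalty is meant to produce.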
Machine learning (ML) is increasingly applied to medical image processing with appropriate learning paradigms. These applications include analyzing images of various organs, such as the brain, lung, and eye, to identify specific flaws or diseases for diagnosis. The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification. Most of the extracted image features are irrelevant and increase computation time. Therefore, this article uses an analytical learning paradigm to design a Congruent Feature Selection Method that selects the most relevant image features. This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions. The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis. Later, the correlation based on intensity and distribution is analyzed to improve the feature selection congruency. The more congruent pixels are then sorted in descending order of selection, which identifies better regions than the distribution. The learning paradigm is then trained using intensity and region-based similarity to maximize the chances of selection. As a result, the probability of feature selection, regardless of the textures and medical image patterns, is improved. This process enhances the performance of ML applications for different medical image processing tasks. The proposed method improves the accuracy, precision, and training rate by 13.19%, 10.69%, and 11.06%, respectively, compared to other models for the selected dataset. The mean error and selection time are also reduced by 12.56% and 13.56%, respectively, compared to the same models and dataset.
Determining the shear bond strength (SBS) at the interlayer of double-layer asphalt concrete is crucial in flexible pavement structures. The study used three Machine Learning (ML) models, namely K-Nearest Neighbors (KNN), Extra Trees (ET), and Light Gradient Boosting Machine (LGBM), to predict SBS based on easily determinable input parameters. The Grid Search technique was employed for hyper-parameter tuning of the ML models, and cross-validation and learning curve analysis were used for training the models. The models were built on a database of 240 experimental results and three input variables: temperature, normal pressure, and tack coat rate. Model validation was performed using three statistical criteria: the coefficient of determination (R²), the Root Mean Square Error (RMSE), and the Mean Absolute Error (MAE). Additionally, SHapley Additive exPlanations (SHAP) analysis was used to validate the importance of the input variables in the prediction of the SBS. Results show that these models accurately predict SBS, with LGBM providing outstanding performance. SHAP analysis for LGBM indicates that temperature is the most influential factor on SBS. Consequently, the proposed ML models can quickly and accurately predict SBS between two layers of asphalt concrete, serving practical applications in flexible pavement structure design.
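The three validation criteria named in the abstract above (R², RMSE, and MAE) follow standard definitions, sketched below in plain Python. The function name `regression_metrics` and the sample values are illustrative assumptions and are not data from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAE for paired observed and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Illustrative values only (not measurements from the paper).
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

Because RMSE squares the residuals before averaging, a single large error raises it more than it raises MAE, which is one reason to report both.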
The high porosity and tunable chemical functionality of metal-organic frameworks (MOFs) make them a promising catalyst design platform. High-throughput screening of catalytic performance is feasible because a large MOF structure database is available. In this study, we report a machine learning model for high-throughput screening of MOF catalysts for the CO₂ cycloaddition reaction. The descriptors for model training were judiciously chosen according to the reaction mechanism, which leads to high accuracy up to 97% for the 75% quantile of the training set as the classification criterion. The feature contribution was further evaluated with SHAP and PDP analysis to provide a certain physical understanding. A total of 12,415 hypothetical MOF structures and 100 reported MOFs were evaluated under 100 °C and 1 bar within one day using the model, and 239 potentially efficient catalysts were discovered. Among them, MOF-76(Y) achieved the top performance experimentally among reported MOFs, in good agreement with the prediction.
In engineering practice, it is often necessary to determine functional relationships between dependent and independent variables. These relationships can be highly nonlinear, and classical regression approaches cannot always provide sufficiently reliable solutions. Nevertheless, Machine Learning (ML) techniques, which offer advanced regression tools to address complicated engineering issues, have been developed and widely explored. This study investigates selected ML techniques to evaluate their suitability for application to the hot deformation behavior of metallic materials. The ML-based regression methods of Artificial Neural Networks (ANNs), Support Vector Machine (SVM), Decision Tree Regression (DTR), and Gaussian Process Regression (GPR) are applied to mathematically describe hot flow stress curve datasets acquired experimentally for a medium-carbon steel. Although the GPR method has not been used for such a regression task before, the results showed that its performance is the most favorable and practically unrivaled; neither the ANN method nor the other studied ML techniques provide such precise results of the solved regression analysis.
To study the characteristics of pure fly ash-based geopolymer concrete (PFGC) conveniently, we used a machine learning method that can quantify the perception of characteristics to predict its compressive strength. In this study, 505 groups of data were collected, and a new database of the compressive strength of PFGC was constructed. To establish an accurate prediction model of compressive strength, five different types of machine learning networks were used for comparative analysis. All five machine learning models showed good compressive strength prediction performance on PFGC. Among them, the R², MSE, RMSE and MAE of the decision tree (DT) model are 0.99, 1.58, 1.25, and 0.25, respectively, while those of the random forest (RF) model are 0.97, 5.17, 2.27 and 1.38, respectively. The two models have high prediction accuracy and outstanding generalization ability. To enhance the interpretability of model decision-making, we used importance ranking to obtain the perception of the machine learning model to 13 variables. These 13 variables include the chemical composition of fly ash (SiO₂/Al₂O₃, Si/Al), the ratio of alkaline liquid to the binder, curing temperature, curing durations inside oven, fly ash dosage, fine aggregate dosage, coarse aggregate dosage, extra water dosage and sodium hydroxide dosage. Curing temperature, specimen ages and curing durations inside oven have the greatest influence on the prediction results, indicating that curing conditions have a more prominent influence on the compressive strength of PFGC than on that of ordinary Portland cement concrete. The importance of the curing conditions of PFGC even exceeds that of the concrete mix proportion, due to the low reactivity of pure fly ash.
Diabetic retinopathy (DR) remains a leading cause of vision impairment and blindness among individuals with diabetes, necessitating innovative approaches to screening and management. This editorial explores the transformative potential of artificial intelligence (AI) and machine learning (ML) in revolutionizing DR care. AI and ML technologies have demonstrated remarkable advancements in enhancing the accuracy, efficiency, and accessibility of DR screening, helping to overcome barriers to early detection. These technologies leverage vast datasets to identify patterns and predict disease progression with unprecedented precision, enabling clinicians to make more informed decisions. Furthermore, AI-driven solutions hold promise in personalizing management strategies for DR, incorporating predictive analytics to tailor interventions and optimize treatment pathways. By automating routine tasks, AI can reduce the burden on healthcare providers, allowing for a more focused allocation of resources towards complex patient care. This review aims to evaluate the current advancements and applications of AI and ML in DR screening, and to discuss the potential of these technologies in developing personalized management strategies, ultimately aiming to improve patient outcomes and reduce the global burden of DR. The integration of AI and ML in DR care represents a paradigm shift, offering a glimpse into the future of ophthalmic healthcare.
BACKGROUND: Machine learning (ML), a major branch of artificial intelligence, has not only demonstrated the potential to significantly improve numerous sectors of healthcare but has also made significant contributions to the field of solid organ transplantation. ML provides revolutionary opportunities in areas such as donor-recipient matching, post-transplant monitoring, and patient care by automatically analyzing large amounts of data, identifying patterns, and forecasting outcomes. AIM: To conduct a comprehensive bibliometric analysis of publications on the use of ML in transplantation to understand current research trends and their implications. METHODS: On July 18, a thorough search strategy was used with the Web of Science database. ML and transplantation-related keywords were utilized. With the aid of the VOSviewer application, the identified articles were subjected to bibliometric variable analysis in order to determine publication counts, citation counts, contributing countries, and institutions, among other factors. RESULTS: Of the 529 articles that were first identified, 427 were deemed relevant for bibliometric analysis. A surge in publications was observed over the last four years, especially after 2018, signifying growing interest in this area. With 209 publications, the United States emerged as the top contributor. Notably, the "Journal of Heart and Lung Transplantation" and the "American Journal of Transplantation" emerged as the leading journals, publishing the highest number of relevant articles. Frequent keyword searches revealed that patient survival, mortality, outcomes, allocation, and risk assessment were significant themes of focus. CONCLUSION: The growing body of pertinent publications reflects ML's expanding presence in the field of solid organ transplantation. This bibliometric analysis underscores the increasing importance of ML in transplant research and its potential to change medical practices and enhance patient outcomes. Encouraging collaboration between significant contributors can potentially fast-track advancements in this interdisciplinary domain.
BACKGROUND: Patients with early-stage hepatocellular carcinoma (HCC) generally have good survival rates following surgical resection. However, a subset of these patients experience recurrence within five years post-surgery. AIM: To develop predictive models utilizing machine learning (ML) methods to detect early-stage patients at a high risk of mortality. METHODS: Eight hundred and eight patients with HCC at Beijing Ditan Hospital were randomly allocated to training and validation cohorts in a 2:1 ratio. Prognostic models were generated using random survival forests and artificial neural networks (ANNs). These ML models were compared with other classic HCC scoring systems. A decision-tree model was established to validate the contribution of immune-inflammatory indicators to the long-term outlook of patients with early-stage HCC. RESULTS: Immune-inflammatory markers, albumin-bilirubin scores, alpha-fetoprotein, tumor size, and International Normalized Ratio were closely associated with the 5-year survival rates. Among various predictive models, the ANN model generated using these indicators through ML algorithms exhibited superior performance, with a 5-year area under the curve (AUC) of 0.85 (95%CI: 0.82-0.88). In the validation cohort, the 5-year AUC was 0.82 (95%CI: 0.74-0.85). According to the ANN model, patients were classified into high-risk and low-risk groups, with an overall survival hazard ratio of 7.98 (95%CI: 5.85-10.93, P<0.0001) between the two groups.
INTRODUCTION: Hepatocellular carcinoma (HCC) is one of the six most prevalent cancers [1] and the third leading cause of cancer-related mortality [2]. China has some of the highest incidence and mortality rates for liver cancer, accounting for half of global cases [3,4]. The Barcelona Clinic Liver Cancer (BCLC) Staging System is the most widely used framework for diagnosing and treating HCC [5]. The optimal candidates for surgical treatment are those with early-stage HCC, classified as BCLC stage 0 or A. Patients with early-stage liver cancer typically have a better prognosis after surgical resection, achieving a 5-year survival rate of 60%-70% [6]. However, the high postoperative recurrence rates of HCC remain a major obstacle to long-term efficacy. To improve the prognosis of patients with early-stage HCC, it is necessary to develop models that can identify those with poor prognoses, enabling stratified and personalized treatment and follow-up strategies. Chronic inflammation is linked to the development and advancement of tumors [7]. Recently, peripheral blood immune indicators, such as neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), and lymphocyte-to-monocyte ratio (LMR), have garnered extensive attention and have been used to predict survival in various tumors and inflammation-related diseases [8-10]. However, the relationship between these combinations of immune markers and the outcomes in patients with early-stage HCC requires further investigation. Machine learning (ML) algorithms are capable of handling large and complex datasets, generating more accurate and personalized predictions through unique training algorithms that better manage nonlinear statistical relationships than traditional analytical methods. Commonly used ML models include artificial neural networks (ANNs) and random survival forests (RSFs), which have shown satisfactory accuracy in prognostic predictions across various cancers and other diseases [11-13]. ANNs have performed well in identifying the progression from liver cirrhosis to HCC and predicting overall survival (OS) in patients with HCC [14,15]. However, no studies have confirmed the ability of ML models to predict post-surgical survival in patients with early-stage HCC. Through ML, a better understanding of the risk factors for early-stage HCC prognosis can be achieved. This aids in surgical decision-making, identifying patients at a high risk of mortality, and selecting subsequent treatment strategies. In this study, we aimed to establish a 5-year prognostic model for patients with early-stage HCC after surgical resection, based on ML and systemic immune-inflammatory indicators. This model seeks to improve the early monitoring of high-risk patients and provide personalized treatment plans.
Anastomotic leakage (AL) is a significant complication following rectal cancer surgery, adversely affecting both quality of life and oncological outcomes. Recent advancements in artificial intelligence (AI), particularly machine learning and deep learning, offer promising avenues for predicting and preventing AL. These technologies can analyze extensive clinical datasets to identify preoperative and perioperative risk factors such as malnutrition, body composition, and radiological features. AI-based models have demonstrated superior predictive power compared to traditional statistical methods, potentially guiding clinical decision-making and improving patient outcomes. Additionally, AI can provide surgeons with intraoperative feedback on blood supply and anatomical dissection planes, minimizing the risk of intraoperative complications and reducing the likelihood of AL development.
Seasonal precipitation has always been a key focus of climate prediction. As a dynamic-statistical combined method, the existing observational constraint correction establishes a regression relationship between the numerical model outputs and historical observations, which can partly predict seasonal precipitation. However, solving a nonlinear problem through linear regression is significantly biased. This study implements a nonlinear optimization of an existing observational constrained correction model using a Light Gradient Boosting Machine (LightGBM) machine learning algorithm based on output from the Beijing National Climate Center Climate System Model (BCC-CSM) and station observations to improve the prediction of summer precipitation in China. The model was trained using a rolling approach, and LightGBM outperformed Linear Regression (LR), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost). Using parameter tuning to optimize the machine learning model and predict future summer precipitation using eight different predictors in BCC-CSM, the mean Anomaly Correlation Coefficient (ACC) score in the 2019–22 summer precipitation predictions was 0.17, and the mean Prediction Score (PS) reached 74. The PS score was improved by 7.87% and 6.63% compared with the BCC-CSM and the linear observational constraint approach, respectively. The observational constraint correction prediction strategy with LightGBM significantly and stably improved the prediction of summer precipitation in China compared to the previous linear observational constraint solution, providing a reference for flood control and drought relief during the flood season (summer) in China.
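The Anomaly Correlation Coefficient (ACC) reported in the abstract above measures how well the spatial pattern of predicted anomalies matches the observed one. The sketch below uses the common centered form (a Pearson correlation of anomalies across stations); the abstract does not state which variant or weighting the authors used, so this form, the function name `anomaly_correlation`, and the sample numbers are assumptions.

```python
import math


def anomaly_correlation(forecast, observed, climatology):
    """Centered anomaly correlation across stations (unweighted)."""
    # Anomalies are departures from the climatological mean at each station.
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    o_anom = [o - c for o, c in zip(observed, climatology)]
    n = len(f_anom)
    f_mean = sum(f_anom) / n
    o_mean = sum(o_anom) / n
    cov = sum((f - f_mean) * (o - o_mean) for f, o in zip(f_anom, o_anom))
    f_var = sum((f - f_mean) ** 2 for f in f_anom)
    o_var = sum((o - o_mean) ** 2 for o in o_anom)
    return cov / math.sqrt(f_var * o_var)


# Illustrative station values in mm (not data from the study).
climatology = [100.0, 120.0, 90.0, 110.0]
observed = [110.0, 115.0, 95.0, 100.0]
mirrored = [2 * c - o for c, o in zip(climatology, observed)]

acc_perfect = anomaly_correlation(observed, observed, climatology)
acc_opposite = anomaly_correlation(mirrored, observed, climatology)
```

A forecast that reproduces the observed anomaly pattern scores close to 1, and one with every anomaly sign flipped scores close to -1, so the reported mean of 0.17 indicates a weak but positive pattern agreement.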
The roles of diurnal temperature in providing heat accumulation and chilling requirements for vegetation spring phenology differ. Although previous studies have established a stronger correlation between leaf onset and diurnal temperature than between leaf onset and average temperature, current research on modeling spring phenology based on diurnal temperature indicators remains limited. In this study, we confirmed the start of the growing season (SOS) sensitivity to diurnal temperature and average temperature in boreal forest. The estimation of SOS was carried out by employing a K-Nearest Neighbor Regression (KNR-TDN) model, a Random Forest Regression (RFR-TDN) model, an eXtreme Gradient Boosting (XGB-TDN) model and a Light Gradient Boosting Machine (LightGBM-TDN) model driven by diurnal temperature indicators during 1982-2015, and the SOS was projected from 2015 to 2100 based on the Coupled Model Intercomparison Project Phase 6 (CMIP6) climate scenario datasets. The sensitivity of boreal forest SOS to daytime temperature is greater than that to average temperature and nighttime temperature. The LightGBM-TDN model performs best across all vegetation types, exhibiting the lowest RMSE and bias compared to the KNR-TDN, RFR-TDN and XGB-TDN models. Incorporating diurnal temperature indicators instead of relying only on average temperature indicators to simulate spring phenology improves the accuracy of the model. Furthermore, the preseason accumulated daytime temperature, daytime temperature and snow cover end date emerged as significant drivers of the SOS simulation in the study area. The simulation results based on the LightGBM-TDN model exhibit a trend of advancing SOS followed by stabilization under future climate scenarios. This study underscores the potential of diurnal temperature indicators as a viable alternative to average temperature indicators in driving spring phenology models, offering a promising new method for simulating spring phenology.
Seismic fragility analysis (SFA) is known as an effective probabilistic approach used to evaluate seismic fragility. There are various sources of uncertainty associated with this approach. A nuclear power plant (NPP) system is an extremely important infrastructure and contains many structural uncertainties due to construction issues or structural deterioration during service. Simulating the effects of structural uncertainties is a costly and time-consuming endeavor. A novel approach to SFA for the NPP considering structural uncertainties based on the damage state is proposed and examined. The results suggest that considering the structural uncertainties is essential in assessing the fragility of the NPP structure, and the impact of structural uncertainties tends to increase with the state of damage. Subsequently, machine learning (ML) is found to be superior in high-precision damage state identification of the NPP for reducing the time of nonlinear time-history analysis (NLTHA) and could be applied in the damage state-based SFA. Also, the impact of various sources of uncertainty is investigated through sensitivity analysis. The Sobol and SHapley Additive exPlanations (SHAP) methods can complement each other and are able to quantify seismic and structural uncertainties simultaneously, together with the interaction effect of each parameter.
BACKGROUND: Transjugular intrahepatic portosystemic shunt (TIPS) is an effective intervention for managing complications of portal hypertension, particularly acute variceal bleeding (AVB). While effective in reducing portal pressure and preventing rebleeding, TIPS is associated with a considerable risk of overt hepatic encephalopathy (OHE), a complication that significantly elevates mortality rates. AIM: To develop a machine learning (ML) model to predict OHE occurrence post-TIPS in patients with AVB using a 5-year dataset. METHODS: This retrospective single-center study included 218 patients with AVB who underwent TIPS. The dataset was divided into training (70%) and testing (30%) sets. Critical features were identified using embedded methods and recursive feature elimination. Three ML algorithms (random forest, extreme gradient boosting, and logistic regression) were validated via 10-fold cross-validation. SHapley Additive exPlanations analysis was employed to interpret the model's predictions. Survival analysis was conducted using Kaplan-Meier curves and stepwise Cox regression analysis to compare overall survival (OS) between patients with and without OHE. RESULTS: The median OS of the study cohort was 47.83±22.95 months. Among the models evaluated, logistic regression demonstrated the highest performance with an area under the curve (AUC) of 0.825. Key predictors identified were Child-Pugh score, age, and portal vein thrombosis. Kaplan-Meier analysis revealed that patients without OHE had a significantly longer OS (P=0.005). The 5-year survival rate was 78.4%, with an OHE incidence of 15.1%. Both actual OHE status and predicted OHE value were significant predictors in each Cox model, with model-predicted OHE achieving an AUC of 88.1 in survival prediction. CONCLUSION: The ML model accurately predicts post-TIPS OHE and outperforms traditional models, supporting its use in improving outcomes in patients with AVB.
Objective: To establish and validate a novel diabetic retinopathy (DR) risk-prediction model using a whole-exome sequencing (WES)-based machine learning (ML) method. Methods: WES was performed to identify potential single nucleotide polymorphism (SNP) or mutation sites in a DR pedigree comprising 10 members. A prediction model was established and validated in a cohort of 420 type 2 diabetic patients based on both genetic and demographic features. The contribution of each feature was assessed using SHapley Additive exPlanations analysis. The efficacies of the models with and without SNPs were compared. Results: WES revealed that seven SNPs/mutations (rs116911833 in TRIM7, 1997T>C in LRBA, 1643T>C in PRMT10, rs117858678 in C9orf152, rs201922794 in CLDN25, rs146694895 in SH3GLB2, and rs201407189 in FANCC) were associated with DR. Notably, the model including rs146694895 and rs201407189 achieved better performance in predicting DR (accuracy: 80.2%; sensitivity: 83.3%; specificity: 76.7%; area under the receiver operating characteristic curve [AUC]: 80.0%) than the model without these SNPs (accuracy: 79.4%; sensitivity: 80.3%; specificity: 78.3%; AUC: 79.3%). Conclusion: Novel SNP sites associated with DR were identified in the DR pedigree. Inclusion of rs146694895 and rs201407189 significantly enhanced the performance of the ML-based DR prediction model.
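The accuracy, sensitivity, and specificity figures compared in the abstract above all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_classification_metrics` and the toy labels are assumptions for illustration and do not reproduce the study's cohort.

```python
def binary_classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for 0/1 labels (1 = disease present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of true cases that are detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
    }


# Toy labels only (not patient data).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
scores = binary_classification_metrics(y_true, y_pred)
```

Sensitivity and specificity can move in opposite directions, as they do between the two models in the abstract, so accuracy alone does not show which kind of error a model trades away.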
Accurately estimating protein–ligand binding free energy is crucial for drug design and biophysics, yet remains a challenging task. In this study, we applied the screening molecular mechanics/Poisson–Boltzmann surface area (MM/PBSA) method in combination with various machine learning techniques to compute the binding free energies of protein–ligand interactions. Our results demonstrate that machine learning outperforms direct screening MM/PBSA calculations in predicting protein–ligand binding free energies. Notably, the random forest (RF) method exhibited the best predictive performance, with a Pearson correlation coefficient (rp) of 0.702 and a mean absolute error (MAE) of 1.379 kcal/mol. Furthermore, we analyzed feature importance rankings in the gradient boosting (GB), adaptive boosting (AdaBoost), and RF methods, and found that feature selection significantly impacted predictive performance. In particular, molecular weight (MW) and van der Waals (VDW) energies played a decisive role in the prediction. Overall, this study highlights the potential of combining machine learning methods with screening MM/PBSA for accurately predicting binding free energies in biosystems.
The unique long-range disordered atomic arrangement inherent in amorphous materials endows them with a range of superior properties, rendering them highly promising for applications in catalysis, medicine, and battery technology, among other fields. Since not all materials can be synthesized into an amorphous structure, the composition design of amorphous materials holds significant importance. Machine learning offers a valuable alternative to traditional "trial-and-error" methods by predicting properties through experimental data, thus providing efficient guidance in material design. In this study, we develop a machine learning workflow to predict the critical casting diameter, glass transition temperature, and Young's modulus for 45 reported ternary amorphous alloy systems. The predicted results have been organized into a database, enabling direct retrieval of predicted values based on compositional information. Furthermore, the applications of high glass forming ability region screening for a specified system, multi-property target system screening, and high glass forming ability region search through iteration are also demonstrated. By utilizing machine learning predictions, researchers can effectively narrow the experimental scope and expedite the exploration of compositions.
Machine learning (ML) is well suited for the prediction of high-complexity, high-dimensional problems such as those encountered in terminal ballistics. We evaluate the performance of four popular ML-based regression models, extreme gradient boosting (XGBoost), artificial neural network (ANN), support vector regression (SVR), and Gaussian process regression (GP), on two common terminal ballistics problems: (a) predicting the V50 ballistic limit of monolithic metallic armour impacted by small and medium calibre projectiles and fragments, and (b) predicting the depth to which a projectile will penetrate a target of semi-infinite thickness. To achieve this we utilise two datasets, each consisting of approximately 1000 samples, collated from public release sources. We demonstrate that all four model types provide similarly excellent agreement when interpolating within the training data and diverge when extrapolating outside this range. Although extrapolation is not advisable for ML-based regression models, for applications such as lethality/survivability analysis, such capability is required. To circumvent this, we implement expert knowledge and physics-based models via enforced monotonicity, as a Gaussian prior mean, and through a modified loss function. The physics-informed models demonstrate improved performance over both classical physics-based models and the basic ML regression models, providing an ability to accurately fit experimental data when it is available and then revert to the physics-based model when not. The resulting models demonstrate high levels of predictive accuracy over a very wide range of projectile types, target materials and thicknesses, and impact conditions significantly more diverse than that achievable from any existing analytical approach. Compared with numerical analysis tools such as finite element solvers, the ML models run orders of magnitude faster. We provide some general guidelines throughout for the development, application, and reporting of ML models in terminal ballistics problems.
Additive manufacturing technology is highly regarded due to its advantages, such as high precision and the ability to address complex geometric challenges. However, the development of additive manufacturing processes is constrained by issues like unclear fundamental principles, complex experimental cycles, and high costs. Machine learning, as a novel artificial intelligence technology, has the potential to deeply engage in the development of additive manufacturing processes, assisting engineers in learning and developing new techniques. This paper provides a comprehensive overview of the research and applications of machine learning in the field of additive manufacturing, particularly in model design and process development. First, it introduces the background and significance of machine learning-assisted design in additive manufacturing processes. It then delves into the application of machine learning in additive manufacturing, focusing on model design and process guidance. Finally, it concludes by summarizing and forecasting the development trends of machine learning technology in the field of additive manufacturing.
The potential for reducing greenhouse gas (GHG) emissions and energy consumption in wastewater treatment can be realized through intelligent control, with machine learning (ML) and multimodality emerging as a promising solution. Here, we introduce an ML technique based on multimodal strategies, focusing specifically on intelligent aeration control in wastewater treatment plants (WWTPs). The generalization of the multimodal strategy is demonstrated on eight ML models. The results demonstrate that this multimodal strategy significantly enhances model indicators for ML in environmental science and the efficiency of aeration control, exhibiting exceptional performance and interpretability. Integrating random forest with visual models achieves the highest accuracy in forecasting aeration quantity in multimodal models, with a mean absolute percentage error of 4.4% and a coefficient of determination of 0.948. Practical testing in a full-scale plant reveals that the multimodal model can reduce operation costs by 19.8% compared to traditional fuzzy control methods. The potential application of these strategies in critical water science domains is discussed. To foster accessibility and promote widespread adoption, the multimodal ML models are freely available on GitHub, thereby eliminating technical barriers and encouraging the application of artificial intelligence in urban wastewater treatment.
Abstract: The rapid growth of machine learning (ML) across fields has intensified the challenge of selecting the right algorithm for specific tasks, known as the Algorithm Selection Problem (ASP). Traditional trial-and-error methods have become impractical due to their resource demands. Automated Machine Learning (AutoML) systems automate this process, but often neglect the group structures and sparsity in meta-features, leading to inefficiencies in algorithm recommendations for classification tasks. This paper proposes a meta-learning approach using Multivariate Sparse Group Lasso (MSGL) to address these limitations. Our method models both within-group and across-group sparsity among meta-features to manage high-dimensional data and reduce multicollinearity across eight meta-feature groups. The Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) with adaptive restart efficiently solves the non-smooth optimization problem. Empirical validation on 145 classification datasets with 17 classification algorithms shows that our meta-learning method outperforms four state-of-the-art approaches, achieving 77.18% classification accuracy, 86.07% recommendation accuracy and 88.83% normalized discounted cumulative gain.
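For readers unfamiliar with the ranking metric named in this abstract, the following is a minimal sketch of how normalized discounted cumulative gain is commonly computed. It is not the authors' evaluation code: the function names, the log2 discount, and the relevance scores are assumptions chosen purely for illustration.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of relevance scores given in ranked order."""
    # Position 0 is the top recommendation; its discount is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical relevance of three algorithms, listed in the recommended order.
recommended = [2.0, 3.0, 1.0]
score = ndcg(recommended)
print(f"NDCG = {score:.4f}")
```

A recommendation list that already places the best algorithms first scores 1.0; any misordering near the top of the list lowers the score more than one near the bottom.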
Funding: the Deanship of Scientific Research at King Khalid University, through a large group Research Project under grant number RGP2/421/45; Prince Sattam bin Abdulaziz University, project number PSAU/2024/R/1446; the Researchers Supporting Project Number UM-DSR-IG-2023-07, Almaarefa University, Riyadh, Saudi Arabia; and the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2021R1F1A1055408).
Abstract: Machine learning (ML) is increasingly applied for medical image processing with appropriate learning paradigms. These applications include analyzing images of various organs, such as the brain, lung, eye, etc., to identify specific flaws/diseases for diagnosis. The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification. Most of the extracted image features are irrelevant and lead to an increase in computation time. Therefore, this article uses an analytical learning paradigm to design a Congruent Feature Selection Method to select the most relevant image features. This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions. The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis. Later, the correlation based on intensity and distribution is analyzed to improve the feature selection congruency. Therefore, the more congruent pixels are sorted in the descending order of the selection, which identifies better regions than the distribution. Now, the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection. Therefore, the probability of feature selection, regardless of the textures and medical image patterns, is improved. This process enhances the performance of ML applications for different medical image processing. The proposed method improves the accuracy, precision, and training rate by 13.19%, 10.69%, and 11.06%, respectively, compared to other models for the selected dataset. The mean error and selection time are also reduced by 12.56% and 13.56%, respectively, compared to the same models and dataset.
Funding: the University of Transport Technology under grant number DTTD2022-12.
Abstract: Determination of shear bond strength (SBS) at the interlayer of double-layer asphalt concrete is crucial in flexible pavement structures. The study used three Machine Learning (ML) models, K-Nearest Neighbors (KNN), Extra Trees (ET), and Light Gradient Boosting Machine (LGBM), to predict SBS based on easily determinable input parameters. The Grid Search technique was employed for hyper-parameter tuning of the ML models, and cross-validation and learning curve analysis were used for training the models. The models were built on a database of 240 experimental results and three input variables: temperature, normal pressure, and tack coat rate. Model validation was performed using three statistical criteria: the coefficient of determination (R²), the Root Mean Square Error (RMSE), and the Mean Absolute Error (MAE). Additionally, SHapley Additive exPlanations (SHAP) analysis was used to validate the importance of the input variables in the prediction of SBS. Results show that these models accurately predict SBS, with LGBM providing outstanding performance. SHAP analysis for LGBM indicates that temperature is the most influential factor on SBS. Consequently, the proposed ML models can quickly and accurately predict SBS between two layers of asphalt concrete, serving practical applications in flexible pavement structure design.
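The three validation criteria named in this abstract have standard definitions, sketched below in plain Python. The helper name `regression_metrics` and the toy values are hypothetical and are not taken from the study's 240-sample database.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R2, RMSE and MAE for paired observations and predictions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "RMSE": rmse, "MAE": mae}

# Hypothetical measured and predicted values, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(measured, predicted)
print(metrics)
```

RMSE penalizes large individual errors more heavily than MAE, which is one reason studies of this kind usually report both alongside R².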
Funding: financial support from the National Key Research and Development Program of China (2021YFB3501501); the National Natural Science Foundation of China (Nos. 22225803, 22038001, 22108007 and 22278011); the Beijing Natural Science Foundation (No. Z230023); and the Beijing Science and Technology Commission (No. Z211100004321001).
Abstract: The high porosity and tunable chemical functionality of metal-organic frameworks (MOFs) make them a promising catalyst design platform. High-throughput screening of catalytic performance is feasible since a large MOF structure database is available. In this study, we report a machine learning model for high-throughput screening of MOF catalysts for the CO₂ cycloaddition reaction. The descriptors for model training were judiciously chosen according to the reaction mechanism, leading to an accuracy of up to 97% when the 75% quantile of the training set is used as the classification criterion. The feature contribution was further evaluated with SHAP and PDP analysis to provide a certain physical understanding. A total of 12,415 hypothetical MOF structures and 100 reported MOFs were evaluated at 100 ℃ and 1 bar within one day using the model, and 239 potentially efficient catalysts were discovered. Among them, MOF-76(Y) achieved the top performance experimentally among reported MOFs, in good agreement with the prediction.
Funding: supported by the SP2024/089 Project by the Faculty of Materials Science and Technology, VŠB-Technical University of Ostrava.
Abstract: In engineering practice, it is often necessary to determine functional relationships between dependent and independent variables. These relationships can be highly nonlinear, and classical regression approaches cannot always provide sufficiently reliable solutions. Nevertheless, Machine Learning (ML) techniques, which offer advanced regression tools to address complicated engineering issues, have been developed and widely explored. This study investigates selected ML techniques to evaluate their suitability for application to the hot deformation behavior of metallic materials. The ML-based regression methods of Artificial Neural Networks (ANNs), Support Vector Machine (SVM), Decision Tree Regression (DTR), and Gaussian Process Regression (GPR) are applied to mathematically describe hot flow stress curve datasets acquired experimentally for a medium-carbon steel. Although the GPR method has not been used for such a regression task before, the results showed that its performance is the most favorable and practically unrivaled; neither the ANN method nor the other studied ML techniques provide such precise results for the regression analysis.
Funding: funded by the Natural Science Foundation of China (No. 52109168).
Abstract: In order to study the characteristics of pure fly ash-based geopolymer concrete (PFGC) conveniently, we used a machine learning method that can quantify the perception of characteristics to predict its compressive strength. In this study, 505 groups of data were collected, and a new database of compressive strength of PFGC was constructed. In order to establish an accurate prediction model of compressive strength, five different types of machine learning networks were used for comparative analysis. All five machine learning models showed good compressive strength prediction performance on PFGC. Among them, the R², MSE, RMSE and MAE of the decision tree (DT) model are 0.99, 1.58, 1.25, and 0.25, respectively, while those of the random forest (RF) model are 0.97, 5.17, 2.27 and 1.38, respectively. The two models have high prediction accuracy and outstanding generalization ability. In order to enhance the interpretability of model decision-making, we used importance ranking to obtain the perception of the machine learning model of 13 variables. These 13 variables include the chemical composition of fly ash (SiO₂/Al₂O₃, Si/Al), the ratio of alkaline liquid to the binder, curing temperature, curing duration inside the oven, fly ash dosage, fine aggregate dosage, coarse aggregate dosage, extra water dosage and sodium hydroxide dosage. Curing temperature, specimen age and curing duration inside the oven have the greatest influence on the prediction results, indicating that curing conditions have a more prominent influence on the compressive strength of PFGC than on that of ordinary Portland cement concrete. The importance of the curing conditions of PFGC even exceeds that of the concrete mix proportion, due to the low reactivity of pure fly ash.
Abstract: Diabetic retinopathy (DR) remains a leading cause of vision impairment and blindness among individuals with diabetes, necessitating innovative approaches to screening and management. This editorial explores the transformative potential of artificial intelligence (AI) and machine learning (ML) in revolutionizing DR care. AI and ML technologies have demonstrated remarkable advancements in enhancing the accuracy, efficiency, and accessibility of DR screening, helping to overcome barriers to early detection. These technologies leverage vast datasets to identify patterns and predict disease progression with unprecedented precision, enabling clinicians to make more informed decisions. Furthermore, AI-driven solutions hold promise in personalizing management strategies for DR, incorporating predictive analytics to tailor interventions and optimize treatment pathways. By automating routine tasks, AI can reduce the burden on healthcare providers, allowing for a more focused allocation of resources towards complex patient care. This review aims to evaluate the current advancements and applications of AI and ML in DR screening, and to discuss the potential of these technologies in developing personalized management strategies, ultimately aiming to improve patient outcomes and reduce the global burden of DR. The integration of AI and ML in DR care represents a paradigm shift, offering a glimpse into the future of ophthalmic healthcare.
Abstract: BACKGROUND: Machine learning (ML), a major branch of artificial intelligence, has not only demonstrated the potential to significantly improve numerous sectors of healthcare but has also made significant contributions to the field of solid organ transplantation. ML provides revolutionary opportunities in areas such as donor-recipient matching, post-transplant monitoring, and patient care by automatically analyzing large amounts of data, identifying patterns, and forecasting outcomes. AIM: To conduct a comprehensive bibliometric analysis of publications on the use of ML in transplantation to understand current research trends and their implications. METHODS: On July 18, a thorough search strategy was used with the Web of Science database. ML and transplantation-related keywords were utilized. With the aid of the VOSviewer application, the identified articles were subjected to bibliometric variable analysis in order to determine publication counts, citation counts, contributing countries, and institutions, among other factors. RESULTS: Of the 529 articles that were first identified, 427 were deemed relevant for bibliometric analysis. A surge in publications was observed over the last four years, especially after 2018, signifying growing interest in this area. With 209 publications, the United States emerged as the top contributor. Notably, the "Journal of Heart and Lung Transplantation" and the "American Journal of Transplantation" emerged as the leading journals, publishing the highest number of relevant articles. Frequent keyword searches revealed that patient survival, mortality, outcomes, allocation, and risk assessment were significant themes of focus. CONCLUSION: The growing body of pertinent publications highlights ML's growing presence in the field of solid organ transplantation. This bibliometric analysis highlights the growing importance of ML in transplant research and its exciting potential to change medical practices and enhance patient outcomes. Encouraging collaboration between significant contributors can potentially fast-track advancements in this interdisciplinary domain.
Funding: supported by the High-Level Chinese Medicine Key Discipline Construction Project, No. zyyzdxk-2023005; the Capital Health Development Research Project, No. 2024-1-2173; and the National Natural Science Foundation of China, No. 82474426 and No. 82474419.
Abstract: BACKGROUND: Patients with early-stage hepatocellular carcinoma (HCC) generally have good survival rates following surgical resection. However, a subset of these patients experience recurrence within five years post-surgery. AIM: To develop predictive models utilizing machine learning (ML) methods to detect early-stage patients at a high risk of mortality. METHODS: Eight hundred and eight patients with HCC at Beijing Ditan Hospital were randomly allocated to training and validation cohorts in a 2:1 ratio. Prognostic models were generated using random survival forests and artificial neural networks (ANNs). These ML models were compared with other classic HCC scoring systems. A decision-tree model was established to validate the contribution of immune-inflammatory indicators to the long-term outlook of patients with early-stage HCC. RESULTS: Immune-inflammatory markers, albumin-bilirubin scores, alpha-fetoprotein, tumor size, and International Normalized Ratio were closely associated with the 5-year survival rates. Among various predictive models, the ANN model generated using these indicators through ML algorithms exhibited superior performance, with a 5-year area under the curve (AUC) of 0.85 (95% CI: 0.82-0.88). In the validation cohort, the 5-year AUC was 0.82 (95% CI: 0.74-0.85). According to the ANN model, patients were classified into high-risk and low-risk groups, with an overall survival hazard ratio of 7.98 (95% CI: 5.85-10.93, P < 0.0001) between the two cohorts. INTRODUCTION: Hepatocellular carcinoma (HCC) is one of the six most prevalent cancers [1] and the third leading cause of cancer-related mortality [2]. China has some of the highest incidence and mortality rates for liver cancer, accounting for half of global cases [3,4]. The Barcelona Clinic Liver Cancer (BCLC) Staging System is the most widely used framework for diagnosing and treating HCC [5]. The optimal candidates for surgical treatment are those with early-stage HCC, classified as BCLC stage 0 or A. Patients with early-stage liver cancer typically have a better prognosis after surgical resection, achieving a 5-year survival rate of 60%-70% [6]. However, the high postoperative recurrence rates of HCC remain a major obstacle to long-term efficacy. To improve the prognosis of patients with early-stage HCC, it is necessary to develop models that can identify those with poor prognoses, enabling stratified and personalized treatment and follow-up strategies. Chronic inflammation is linked to the development and advancement of tumors [7]. Recently, peripheral blood immune indicators, such as neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), and lymphocyte-to-monocyte ratio (LMR), have garnered extensive attention and have been used to predict survival in various tumors and inflammation-related diseases [8-10]. However, the relationship between these combinations of immune markers and the outcomes in patients with early-stage HCC requires further investigation. Machine learning (ML) algorithms are capable of handling large and complex datasets, generating more accurate and personalized predictions through unique training algorithms that better manage nonlinear statistical relationships than traditional analytical methods. Commonly used ML models include artificial neural networks (ANNs) and random survival forests (RSFs), which have shown satisfactory accuracy in prognostic predictions across various cancers and other diseases [11-13]. ANNs have performed well in identifying the progression from liver cirrhosis to HCC and predicting overall survival (OS) in patients with HCC [14,15]. However, no studies have confirmed the ability of ML models to predict post-surgical survival in patients with early-stage HCC. Through ML, a better understanding of the risk factors for early-stage HCC prognosis can be achieved. This aids in surgical decision-making, identifying patients at a high risk of mortality, and selecting subsequent treatment strategies. In this study, we aimed to establish a 5-year prognostic model for patients with early-stage HCC after surgical resection, based on ML and systemic immune-inflammatory indicators. This model seeks to improve the early monitoring of high-risk patients and provide personalized treatment plans.
Abstract: Anastomotic leakage (AL) is a significant complication following rectal cancer surgery, adversely affecting both quality of life and oncological outcomes. Recent advancements in artificial intelligence (AI), particularly machine learning and deep learning, offer promising avenues for predicting and preventing AL. These technologies can analyze extensive clinical datasets to identify preoperative and perioperative risk factors such as malnutrition, body composition, and radiological features. AI-based models have demonstrated superior predictive power compared to traditional statistical methods, potentially guiding clinical decision-making and improving patient outcomes. Additionally, AI can provide surgeons with intraoperative feedback on blood supply and anatomical dissection planes, minimizing the risk of intraoperative complications and reducing the likelihood of AL development.
Funding: jointly supported by the National Natural Science Foundation of China (Grant Nos. 42122034, 42075043, 42330609); the Second Tibetan Plateau Scientific Expedition and Research program (2019QZKK0103); the Key Talent Project in Gansu and the Central Guidance Fund for Local Science and Technology Development Projects in Gansu (No. 24ZYQA031); the Youth Innovation Promotion Association of the Chinese Academy of Sciences (2021427); and the West Light Foundation of the Chinese Academy of Sciences (xbzg-zdsys-202215).
Abstract: Seasonal precipitation has always been a key focus of climate prediction. As a dynamic-statistical combined method, the existing observational constraint correction establishes a regression relationship between the numerical model outputs and historical observations, which can partly predict seasonal precipitation. However, solving a nonlinear problem through linear regression is significantly biased. This study implements a nonlinear optimization of an existing observational constrained correction model using a Light Gradient Boosting Machine (LightGBM) machine learning algorithm based on output from the Beijing National Climate Center Climate System Model (BCC-CSM) and station observations to improve the prediction of summer precipitation in China. The model was trained using a rolling approach, and LightGBM outperformed Linear Regression (LR), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost). Using parameter tuning to optimize the machine learning model and predict future summer precipitation using eight different predictors in BCC-CSM, the mean Anomaly Correlation Coefficient (ACC) score in the 2019–22 summer precipitation predictions was 0.17, and the mean Prediction Score (PS) reached 74. The PS score was improved by 7.87% and 6.63% compared with the BCC-CSM and the linear observational constraint approach, respectively. The observational constraint correction prediction strategy with LightGBM significantly and stably improved the prediction of summer precipitation in China compared to the previous linear observational constraint solution, providing a reference for flood control and drought relief during the flood season (summer) in China.
Funding: under the auspices of the National Natural Science Foundation of China (Nos. 42201374, 42071359).
Abstract: The roles of diurnal temperature in providing heat accumulation and chilling requirements for vegetation spring phenology differ. Although previous studies have established a stronger correlation between leaf onset and diurnal temperature than between leaf onset and average temperature, current research on modeling spring phenology based on diurnal temperature indicators remains limited. In this study, we confirmed the sensitivity of the start of the growing season (SOS) to diurnal temperature and average temperature in boreal forest. The SOS was estimated using a K-Nearest Neighbor Regression (KNR-TDN) model, a Random Forest Regression (RFR-TDN) model, an eXtreme Gradient Boosting (XGB-TDN) model and a Light Gradient Boosting Machine (LightGBM-TDN) model driven by diurnal temperature indicators during 1982-2015, and the SOS was projected from 2015 to 2100 based on the Coupled Model Intercomparison Project Phase 6 (CMIP6) climate scenario datasets. The sensitivity of boreal forest SOS to daytime temperature is greater than that to average temperature and nighttime temperature. The LightGBM-TDN model performs best across all vegetation types, exhibiting the lowest RMSE and bias compared to the KNR-TDN, RFR-TDN and XGB-TDN models. Incorporating diurnal temperature indicators, instead of relying only on average temperature indicators, to simulate spring phenology improves the accuracy of the model. Furthermore, the preseason accumulated daytime temperature, daytime temperature and snow cover end date emerged as significant drivers of the SOS simulation in the study area. The simulation results based on the LightGBM-TDN model exhibit a trend of advancing SOS followed by stabilization under future climate scenarios. This study underscores the potential of diurnal temperature indicators as a viable alternative to average temperature indicators in driving spring phenology models, offering a promising new method for simulating spring phenology.
Funding: National Natural Science Foundation of China under Grant Nos. 52208191 and 51908397; Shanxi Province Science Foundation for Youths under Grant No. 201901D211025; China Postdoctoral Science Foundation under Grant No. 2020M670695.
Abstract: Seismic fragility analysis (SFA) is known as an effective probabilistic approach for evaluating seismic fragility. There are various sources of uncertainty associated with this approach. A nuclear power plant (NPP) system is an extremely important infrastructure and contains many structural uncertainties due to construction issues or structural deterioration during service. Simulating the effects of structural uncertainties is a costly and time-consuming endeavor. A novel approach to SFA for the NPP, considering structural uncertainties based on the damage state, is proposed and examined. The results suggest that considering the structural uncertainties is essential in assessing the fragility of the NPP structure, and the impact of structural uncertainties tends to increase with the state of damage. Subsequently, machine learning (ML) is found to be superior in high-precision damage state identification of the NPP, reducing the time of nonlinear time-history analysis (NLTHA), and could be applied in the damage state-based SFA. Also, the impact of various sources of uncertainty is investigated through sensitivity analysis. The Sobol and Shapley additive explanations (SHAP) methods can complement each other and are able to quantify seismic and structural uncertainties simultaneously, together with the interaction effect of each parameter.
Funding: Natural Science Foundation of Guangdong Province, No. 2024A1515013069.
Abstract: BACKGROUND: Transjugular intrahepatic portosystemic shunt (TIPS) is an effective intervention for managing complications of portal hypertension, particularly acute variceal bleeding (AVB). While effective in reducing portal pressure and preventing rebleeding, TIPS is associated with a considerable risk of overt hepatic encephalopathy (OHE), a complication that significantly elevates mortality rates. AIM: To develop a machine learning (ML) model to predict OHE occurrence post-TIPS in patients with AVB using a 5-year dataset. METHODS: This retrospective single-center study included 218 patients with AVB who underwent TIPS. The dataset was divided into training (70%) and testing (30%) sets. Critical features were identified using embedded methods and recursive feature elimination. Three ML algorithms (random forest, extreme gradient boosting, and logistic regression) were validated via 10-fold cross-validation. SHapley Additive exPlanations analysis was employed to interpret the model's predictions. Survival analysis was conducted using Kaplan-Meier curves and stepwise Cox regression analysis to compare overall survival (OS) between patients with and without OHE. RESULTS: The median OS of the study cohort was 47.83 ± 22.95 months. Among the models evaluated, logistic regression demonstrated the highest performance with an area under the curve (AUC) of 0.825. Key predictors identified were Child-Pugh score, age, and portal vein thrombosis. Kaplan-Meier analysis revealed that patients without OHE had a significantly longer OS (P = 0.005). The 5-year survival rate was 78.4%, with an OHE incidence of 15.1%. Both actual OHE status and predicted OHE value were significant predictors in each Cox model, with model-predicted OHE achieving an AUC of 88.1 in survival prediction. CONCLUSION: The ML model accurately predicts post-TIPS OHE and outperforms traditional models, supporting its use in improving outcomes in patients with AVB.
Funding: supported by the National Natural Science Foundation of China [Grant No. 62206185].
Abstract: Objective: To establish and validate a novel diabetic retinopathy (DR) risk-prediction model using a whole-exome sequencing (WES)-based machine learning (ML) method. Methods: WES was performed to identify potential single nucleotide polymorphism (SNP) or mutation sites in a DR pedigree comprising 10 members. A prediction model was established and validated in a cohort of 420 type 2 diabetic patients based on both genetic and demographic features. The contribution of each feature was assessed using Shapley Additive explanation analysis. The efficacies of the models with and without SNPs were compared. Results: WES revealed that seven SNPs/mutations (rs116911833 in TRIM7, 1997T>C in LRBA, 1643T>C in PRMT10, rs117858678 in C9orf152, rs201922794 in CLDN25, rs146694895 in SH3GLB2, and rs201407189 in FANCC) were associated with DR. Notably, the model including rs146694895 and rs201407189 achieved better performance in predicting DR (accuracy: 80.2%; sensitivity: 83.3%; specificity: 76.7%; area under the receiver operating characteristic curve [AUC]: 80.0%) than the model without these SNPs (accuracy: 79.4%; sensitivity: 80.3%; specificity: 78.3%; AUC: 79.3%). Conclusion: Novel SNP sites associated with DR were identified in the DR pedigree. Inclusion of rs146694895 and rs201407189 significantly enhanced the performance of the ML-based DR prediction model.
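As a reminder of how the accuracy, sensitivity and specificity figures quoted in abstracts like this one relate to a confusion matrix, here is a small sketch using the standard definitions. The function name and the counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true cases that were detected
    specificity = tn / (tn + fp)   # share of non-cases correctly ruled out
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

# Hypothetical counts for a 100-patient test set, for illustration only.
accuracy, sensitivity, specificity = classification_metrics(tp=40, fn=10, tn=45, fp=5)
print(f"accuracy={accuracy:.3f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```

Because accuracy mixes both classes, two models can differ only slightly in accuracy while trading sensitivity against specificity, as the two models compared above do.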
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 12222506, 12347102, 12447164, and 12174184).
Abstract: Accurately estimating protein–ligand binding free energy is crucial for drug design and biophysics, yet remains a challenging task. In this study, we applied the screening molecular mechanics/Poisson–Boltzmann surface area (MM/PBSA) method in combination with various machine learning techniques to compute the binding free energies of protein–ligand interactions. Our results demonstrate that machine learning outperforms direct screening MM/PBSA calculations in predicting protein–ligand binding free energies. Notably, the random forest (RF) method exhibited the best predictive performance, with a Pearson correlation coefficient (r_p) of 0.702 and a mean absolute error (MAE) of 1.379 kcal/mol. Furthermore, we analyzed feature importance rankings in the gradient boosting (GB), adaptive boosting (AdaBoost), and RF methods, and found that feature selection significantly impacted predictive performance. In particular, molecular weight (MW) and van der Waals (VDW) energies played a decisive role in the prediction. Overall, this study highlights the potential of combining machine learning methods with screening MM/PBSA for accurately predicting binding free energies in biosystems.
Funding: Project supported by funding from the National Natural Science Foundation of China (Grant Nos. 52172258, 52473227 and 52171150) and the Strategic Priority Research Program of Chinese Academy of Sciences (Grant No. XDB0500200).
Abstract: The unique long-range disordered atomic arrangement inherent in amorphous materials endows them with a range of superior properties, rendering them highly promising for applications in catalysis, medicine, and battery technology, among other fields. Since not all materials can be synthesized into an amorphous structure, the composition design of amorphous materials holds significant importance. Machine learning offers a valuable alternative to traditional “trial-and-error” methods by predicting properties from experimental data, thus providing efficient guidance in material design. In this study, we develop a machine learning workflow to predict the critical casting diameter, glass transition temperature, and Young's modulus for 45 reported ternary amorphous alloy systems. The predicted results have been organized into a database, enabling direct retrieval of predicted values based on compositional information. Furthermore, three applications are demonstrated: screening for regions of high glass-forming ability within a specified system, screening for systems that meet multi-property targets, and iterative search for regions of high glass-forming ability. By utilizing machine learning predictions, researchers can effectively narrow the experimental scope and expedite the exploration of compositions.
Abstract: Machine learning (ML) is well suited for the prediction of high-complexity, high-dimensional problems such as those encountered in terminal ballistics. We evaluate the performance of four popular ML-based regression models, extreme gradient boosting (XGBoost), artificial neural network (ANN), support vector regression (SVR), and Gaussian process regression (GP), on two common terminal ballistics problems: (a) predicting the V50 ballistic limit of monolithic metallic armour impacted by small and medium calibre projectiles and fragments, and (b) predicting the depth to which a projectile will penetrate a target of semi-infinite thickness. To achieve this, we utilise two datasets, each consisting of approximately 1000 samples, collated from public release sources. We demonstrate that all four model types provide similarly excellent agreement when interpolating within the training data and diverge when extrapolating outside this range. Although extrapolation is not advisable for ML-based regression models, for applications such as lethality/survivability analysis, such capability is required. To circumvent this, we implement expert knowledge and physics-based models via enforced monotonicity, as a Gaussian prior mean, and through a modified loss function. The physics-informed models demonstrate improved performance over both classical physics-based models and the basic ML regression models, providing an ability to accurately fit experimental data when it is available and then revert to the physics-based model when not. The resulting models demonstrate high levels of predictive accuracy over a very wide range of projectile types, target materials and thicknesses, and impact conditions significantly more diverse than that achievable from any existing analytical approach. Compared with numerical analysis tools such as finite element solvers, the ML models run orders of magnitude faster. We provide some general guidelines throughout for the development, application, and reporting of ML models in terminal ballistics problems.
Funding: Financially supported by the Technology Development Fund of China Academy of Machinery Science and Technology (No. 170221ZY01).
Abstract: Additive manufacturing technology is highly regarded due to its advantages, such as high precision and the ability to address complex geometric challenges. However, the development of additive manufacturing processes is constrained by issues like unclear fundamental principles, complex experimental cycles, and high costs. Machine learning, as a novel artificial intelligence technology, has the potential to deeply engage in the development of additive manufacturing processes, assisting engineers in learning and developing new techniques. This paper provides a comprehensive overview of the research and applications of machine learning in the field of additive manufacturing, particularly in model design and process development. First, it introduces the background and significance of machine learning-assisted design in additive manufacturing processes. It then delves into the application of machine learning in additive manufacturing, focusing on model design and process guidance. Finally, it concludes by summarizing and forecasting the development trends of machine learning technology in the field of additive manufacturing.
Funding: Financial support from the National Natural Science Foundation of China (52230004 and 52293445), the Key Research and Development Project of Shandong Province (2020CXGC011202-005), and the Shenzhen Science and Technology Program (KCXFZ20211020163404007 and KQTD20190929172630447).
Abstract: The potential for reducing greenhouse gas (GHG) emissions and energy consumption in wastewater treatment can be realized through intelligent control, with machine learning (ML) and multimodality emerging as a promising solution. Here, we introduce an ML technique based on multimodal strategies, focusing specifically on intelligent aeration control in wastewater treatment plants (WWTPs). The generalization of the multimodal strategy is demonstrated on eight ML models. The results demonstrate that this multimodal strategy significantly enhances model indicators for ML in environmental science and the efficiency of aeration control, exhibiting exceptional performance and interpretability. Integrating random forest with visual models achieves the highest accuracy in forecasting aeration quantity in multimodal models, with a mean absolute percentage error of 4.4% and a coefficient of determination of 0.948. Practical testing in a full-scale plant reveals that the multimodal model can reduce operation costs by 19.8% compared to traditional fuzzy control methods. The potential application of these strategies in critical water science domains is discussed. To foster accessibility and promote widespread adoption, the multimodal ML models are freely available on GitHub, thereby eliminating technical barriers and encouraging the application of artificial intelligence in urban wastewater treatment.
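The forecasting quality in the abstract above is summarized by the mean absolute percentage error (MAPE) and the coefficient of determination (R²). The following is a minimal sketch of both formulas; the function names and the aeration values are hypothetical and are not drawn from the study or its GitHub repository.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero targets)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and forecast aeration quantities (arbitrary units).
measured = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]

error_pct = mape(measured, forecast)
r2 = r_squared(measured, forecast)
print(f"MAPE = {error_pct:.2f}%, R^2 = {r2:.4f}")
```

MAPE expresses the error relative to each measured value, whereas R² expresses how much of the variance in the measurements the forecast explains, which is why a model can score well on one and less well on the other.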