Funding: We are grateful for financial support from Guangzhou Laboratory, Bioland Laboratory, and the National Natural Science Foundation of China (No. 22071249).
Abstract: Although alcohol oxidations are considered well-established reactions, selecting productive conditions and predicting reaction yields for unseen alcohols remain major challenges. Herein, an auto machine learning (ML) model for the TEMPO-catalyzed oxidation of primary alcohols to the corresponding carboxylic acids is disclosed. A dataset of 3444 data points, covering 282 primary alcohols and 45 conditions, was generated using high-throughput experimentation (HTE). With the HTE data and 105 descriptors, a multi-label prediction was performed with AutoGluon (an open-source auto machine learning framework) and KNIME (an open-source data analytics platform). For the independent test of 240 reactions (a full matrix of 20 unseen alcohols and 12 conditions), AutoGluon with multi-label prediction for yield prediction (AGMP) gave excellent performance. For the external test of 1308 reactions (covering 84 alcohols and 45 conditions), AGMP still afforded good results, with an R² of 0.767 and an MAE of 4.9%. The model also revealed that the newly generated descriptor (Y/N, a classification of reaction reactivity) was the most relevant descriptor for yield prediction, offering a new perspective on integrating HTE and ML in organic synthesis.
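For readers less familiar with the two metrics quoted above, the following minimal sketch shows how R² and MAE are conventionally computed from observed and predicted yields. It assumes yields are expressed in percent, so that MAE comes out in percentage points. The function names and the five example yields are hypothetical and purely illustrative; they are not data from the study and this is not the authors' evaluation code.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted yields."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical yields in percent (illustrative only, not from the paper).
observed = [0.0, 20.0, 40.0, 60.0, 80.0]
predicted = [5.0, 15.0, 45.0, 55.0, 85.0]

mae = mean_absolute_error(observed, predicted)
r2 = r2_score(observed, predicted)
print(f"MAE = {mae:.2f} percentage points, R^2 = {r2:.3f}")
```

An MAE of 4.9% in this sense would mean the predicted yield differs from the measured yield by about 4.9 percentage points on average, while R² describes how much of the variance in measured yields the predictions account for.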