Funding: Financial support from the National Natural Science Foundation of China (No. 22078004), the Fundamental Research Funds for the Central Universities (No. buctrc201727), and the Big Science Project from BUCT is greatly appreciated.
Abstract: Metal-organic frameworks (MOFs) containing open metal sites are important materials for acetylene (C2H2) adsorption. However, searching for suitable MOFs by molecular simulation is inefficient, or even impossible, in the nearly infinite space of possible MOFs. Machine learning (ML) methods are therefore adopted for material screening and for predicting high-performance MOFs. In this paper, architectural, chemical, and structural features are used to analyze the C2H2 adsorption performance of MOFs. Different ML algorithms are applied to perform classification and regression analysis of the factors affecting adsorption. Using the decision tree (DT) algorithm, it is found that PV, GSA, and Cu-OMS alone are sufficient to identify MOFs with high adsorption. The influence of topology on MOF performance is also obtained. Gradient Boosting Decision Tree (GBDT), Support Vector Machine (SVM), and Back Propagation Neural Network (BPNN) models are introduced to analyze the quantitative structure-property relationship (QSPR) between C2H2 adsorption and the features of MOFs. The GBDT model gives the most accurate predictions, with an R² of 0.93 and an RMSE of 11.58. In addition, the GBDT model is used for feature analysis to obtain the contribution of each feature to performance, which is of great significance for the design and analysis of MOFs. The successful application of ML to MOF screening greatly reduces calculation time and provides an important reference for the design and synthesis of new MOFs.
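The two regression metrics quoted in the abstract above, R² and RMSE, can be computed as in the following minimal sketch. The function names and the sample values are hypothetical and chosen only for illustration; they are not taken from the paper, and the abstract does not say which software the authors used.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical adsorption values (arbitrary units), NOT data from the paper.
observed = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]

print(f"RMSE = {rmse(observed, predicted):.2f}")
print(f"R^2  = {r2_score(observed, predicted):.3f}")
```

RMSE is expressed in the same units as the predicted quantity, whereas R² is dimensionless, so the two values are best read together.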
Funding: The National Natural Science Foundation of China (22078004), the Fundamental Research Funds for the Central Universities (buctrc201727), and the Big Science Project from BUCT (XK180301).
Abstract: Combining computational materials screening with machine learning (ML) techniques is becoming a popular approach for studying materials for applications of interest. In this work, we began with high-throughput molecular simulations to calculate the methane storage (6.5 MPa) and deliverable (6.5-0.58 MPa) capacities of 404,460 covalent organic frameworks (COFs) at 298 K. The full data set, with 23 features, was then randomly split into training and test sets in a ratio of 20:80, which were used to evaluate the predictive ability of several ML algorithms, including gradient boosting decision tree (GBDT), neural network (NN), support vector machine (SVM), random forest (RF), and decision tree (DT). The results indicate that the RF model has the highest prediction accuracy; it was further employed to reduce the dimension of the feature space and to quantitatively analyze the relative importance of each feature. Binary classification predictors built using the features with the highest influence weights successfully identify top-performing candidates from the test set containing 323,168 COFs with an accuracy exceeding 96%. The deliverable capacities of the identified COFs were found to outperform those reported so far for various adsorbents. The findings may provide useful guidance for the design and synthesis of new high-performance materials for methane storage.
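The random 20:80 training/test split and the classification accuracy mentioned in the abstract above can be sketched as follows. This is an illustration only: the helper names, the seed, and the toy labels are assumptions, and the abstract does not describe how the authors implemented the split.

```python
import random


def split_20_80(items, seed=0):
    """Shuffle the items and return (training, test) with 20% in training."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(0.2 * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def accuracy(labels, predictions):
    """Fraction of predictions that match the true binary labels."""
    matches = sum(1 for l, p in zip(labels, predictions) if l == p)
    return matches / len(labels)


# Hypothetical structure identifiers, NOT the COF database from the paper.
structures = list(range(100))
train_ids, test_ids = split_20_80(structures)
print(len(train_ids), len(test_ids))

# Toy labels: 1 = top-performing candidate, 0 = otherwise (illustrative).
true_labels = [1, 0, 1, 1]
predicted_labels = [1, 0, 0, 1]
print(accuracy(true_labels, predicted_labels))
```

Training on the smaller share and testing on the larger one, as the abstract describes, checks whether a model fitted to a minority of the structures generalizes to the rest.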