Funding: Supported by the National Natural Science Foundation of China under Grant No 60972106, the China Postdoctoral Science Foundation under Grant No 2014M561053, the Humanity and Social Science Foundation of Ministry of Education of China under Grant No 15YJA630108, and the Hebei Province Natural Science Foundation under Grant No E2016202341.
Abstract: The contribution of this work is twofold. (1) A multimodality prediction method for chaotic time series based on the Gaussian process mixture (GPM) model is proposed, which employs a divide-and-conquer strategy. It automatically divides the chaotic time series into multiple modalities with different extrinsic patterns and intrinsic characteristics, and thus can fit the chaotic time series more precisely. (2) An effective sparse hard-cut expectation maximization (SHC-EM) learning algorithm for the GPM model is proposed to improve the prediction performance. SHC-EM replaces a large learning sample set with fewer pseudo inputs, accelerating model learning based on these pseudo inputs. Experiments on Lorenz and Chua time series demonstrate that the proposed method yields not only accurate multimodality prediction, but also the prediction confidence interval. SHC-EM outperforms traditional variational learning in terms of both prediction accuracy and speed. In addition, SHC-EM is more robust and less susceptible to noise than variational learning.
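As a rough illustration of the hard-cut idea named in this abstract, the sketch below assigns each sample to the single mixture component with the highest posterior score instead of keeping soft responsibilities, then refits every component on its own samples. It is not the authors' SHC-EM: plain 1-D Gaussian components stand in for Gaussian process experts, the pseudo-input sparsification is left out, and the names `hard_cut_assign`, `refit`, `weights`, `means`, and `stds` are hypothetical.

```python
import math


def gaussian_logpdf(y, mean, std):
    """Log density of a 1-D Gaussian (stand-in for a GP expert's likelihood)."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (y - mean) ** 2 / (2.0 * std * std)


def hard_cut_assign(samples, weights, means, stds):
    """Hard-cut E-step: label each sample with its most probable component."""
    labels = []
    for y in samples:
        scores = [
            math.log(w) + gaussian_logpdf(y, m, s)
            for w, m, s in zip(weights, means, stds)
        ]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels


def refit(samples, labels, n_components, min_std=1e-3):
    """M-step under hard assignments: refit each component on its own samples."""
    weights, means, stds = [], [], []
    for k in range(n_components):
        members = [y for y, z in zip(samples, labels) if z == k]
        if not members:
            raise ValueError(f"component {k} received no samples")
        mean = sum(members) / len(members)
        var = sum((y - mean) ** 2 for y in members) / len(members)
        weights.append(len(members) / len(samples))
        means.append(mean)
        stds.append(max(math.sqrt(var), min_std))  # floor avoids a zero variance
    return weights, means, stds
```

Alternating `hard_cut_assign` and `refit` until the labels stop changing gives a simple hard-assignment EM loop. Because each component is refit only on its own subset, the per-component cost shrinks, which is the general motivation for hard-cut variants.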
Funding: Financial support received from DST-SERB SRG/2020/000997, India, and the initiation grant received from IIT Kanpur.
Abstract: The increased demand for superior materials has highlighted the need to investigate the mechanical properties of composites to achieve enhanced constitutive relationships. Fiber-reinforced polymer composites have emerged as an integral part of materials development with tailored mechanical properties. However, the complexity and heterogeneity of such composites make it considerably more challenging to quantify properties precisely and attain an optimal design of structures through experimental and computational approaches. To avoid the complex, cumbersome, and labor-intensive experimental and numerical modeling approaches, a machine learning (ML) model is proposed here that takes the microstructural image as input, with a range of Young's moduli for the carbon fibers and neat epoxy, and outputs a visualization of the stress component S11 (principal stress in the x-direction). To obtain the training data for the ML model, a short carbon fiber-filled specimen under quasi-static tension is modeled based on a 2D Representative Area Element (RAE) using finite element analysis. The composite contains short carbon fibers with an aspect ratio of 7.5 that are infilled in the epoxy systems at various random orientations and positions generated using the Simple Sequential Inhibition (SSI) process. The study reveals that the pix2pix deep learning Convolutional Neural Network (CNN) model is robust enough to predict the stress fields in the composite with high accuracy for a given arrangement of short fibers filled in epoxy over the specified range of Young's modulus. The CNN model achieves a correlation score of about 0.999 and an L2 norm of less than 0.005 for a majority of the samples in the design spectrum, indicating excellent prediction capability. In this paper, we have focused on the stage-wise chronological development of the CNN model with optimized performance for predicting the full-field stress maps of the fiber-reinforced composite specimens. The development of such a robust and efficient algorithm would significantly reduce the time and cost required to study and design new composite materials by replacing numerical inputs with direct microstructural images.
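The abstract reports a correlation score and an L2 norm without defining them. The sketch below shows one plausible reading, assuming a Pearson correlation over all pixels and a relative L2 error (norm of the difference divided by the norm of the reference field). Both definitions and the function names are assumptions, not taken from the paper.

```python
import math


def flatten(field):
    """Turn a 2-D stress field (list of rows) into a flat list of pixel values."""
    return [value for row in field for value in row]


def pearson_correlation(predicted, reference):
    """Pearson correlation between predicted and reference fields, pixel by pixel."""
    p, r = flatten(predicted), flatten(reference)
    mean_p, mean_r = sum(p) / len(p), sum(r) / len(r)
    cov = sum((a - mean_p) * (b - mean_r) for a, b in zip(p, r))
    spread_p = math.sqrt(sum((a - mean_p) ** 2 for a in p))
    spread_r = math.sqrt(sum((b - mean_r) ** 2 for b in r))
    return cov / (spread_p * spread_r)


def relative_l2_error(predicted, reference):
    """L2 norm of the prediction error, normalized by the L2 norm of the reference."""
    p, r = flatten(predicted), flatten(reference)
    error_norm = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, r)))
    return error_norm / math.sqrt(sum(b * b for b in r))
```

The two numbers answer different questions: correlation is insensitive to a uniform scaling of the predicted field, while the relative L2 error penalizes it, so reporting both guards against a prediction that has the right pattern but the wrong magnitude.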
Abstract: In this study, a machine learning representation is introduced to evaluate the flexoelectricity effect in a truncated pyramid nanostructure under compression. A Non-Uniform Rational B-spline (NURBS) based IGA formulation is employed to model the flexoelectricity. We investigate a 2D system with an isotropic linear elastic material under plane strain conditions, discretized by a 45×30 grid of B-spline elements. Six input parameters are selected to construct a deep neural network (DNN) model: the Young's modulus, two dielectric permittivity constants, the longitudinal and transversal flexoelectric coefficients, and the order of the shape function. The outputs of interest are the strain in the stress direction and the electric potential due to flexoelectricity. The dataset is generated from the forward analysis of the flexoelectric model. 80% of the dataset is used for training, while the remainder is used for validation by checking the mean squared error. In addition to the input and output layers, the developed DNN model is composed of four hidden layers. The results show the high predictive capability of the proposed method, with much lower computational time than the numerical model.
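To make the stated architecture concrete, here is a minimal sketch of a network with six inputs, four hidden layers, and two outputs, together with an 80/20 train/validation split. The hidden width of 16, the tanh activation, the random initialization, and all function names are assumptions; the abstract specifies only the layer count, the inputs, the outputs, and the split ratio. No training loop is shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, activation=None):
    """Apply one dense layer, optionally followed by an activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def build_network(sizes, seed=0):
    """Create one layer for every consecutive pair of sizes."""
    rng = random.Random(seed)
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(network, x):
    """Hidden layers use tanh; the output layer is linear (regression)."""
    for layer in network[:-1]:
        x = dense(x, layer, math.tanh)
    return dense(x, network[-1])


def train_validation_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and cut it at the requested fraction."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 6 inputs -> four hidden layers (assumed width 16) -> 2 outputs
network = build_network([6, 16, 16, 16, 16, 2])
```

Calling `forward(network, x)` with a six-element list returns two values, which would correspond to the strain and the electric potential once the weights are trained.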
Funding: Supported in part by the National Natural Science Foundation of China (91520301).
Abstract: The development of machine learning in complex systems is currently hindered by two problems. The first is the inefficiency of exploration in state and action space, which makes some state-of-the-art data-driven algorithms data-hungry. The second is the lack of a general theory that can be used to analyze and implement a complex learning system. In this paper, we propose a general method that addresses both issues. We combine the concepts of descriptive learning, predictive learning, and prescriptive learning into a unified framework, so as to build a parallel system that allows the learning system to improve through self-boosting. Formulating a new perspective on data, knowledge, and action, we provide a new methodology called parallel learning for designing machine learning systems for real-world problems.
Funding: Supported in part by the National Natural Science Foundation of China (61503380) and the Natural Science Foundation of Guangdong Province, China (2015A030310187).
Abstract: In this paper, a new machine learning framework called parallel reinforcement learning is developed for complex system control. To overcome the data deficiency of current data-driven algorithms, a parallel system is built to improve complex learning systems through self-guidance. Based on Markov chain (MC) theory, we combine transfer learning, predictive learning, deep learning, and reinforcement learning to tackle the data and action processes and to express the knowledge. The parallel reinforcement learning framework is formulated, and several case studies for real-world problems are introduced.
Funding: This work was supported in part by the National Natural Science Foundation of China (61772493), the CAAI-Huawei MindSpore Open Fund (CAAIXSJLJJ-2020-004B), the Natural Science Foundation of Chongqing (China) (cstc2019jcyjjqX0013), the Chongqing Research Program of Technology Innovation and Application (cstc2019jscx-fxydX0024, cstc2019jscx-fxydX0027, cstc2018jszx-cyzdX0041), the Guangdong Province Universities and College Pearl River Scholar Funded Scheme (2019), the Pioneer Hundred Talents Program of Chinese Academy of Sciences, and the Deanship of Scientific Research (DSR) at King Abdulaziz University (G-21-135-38).
Abstract: Protein-protein interactions are of great significance for understanding the functional mechanisms of proteins. With the rapid development of high-throughput genomic technologies, massive protein-protein interaction (PPI) data have been generated, making it very difficult to analyze them efficiently. To address this problem, this paper presents a distributed framework that reimplements one of the state-of-the-art algorithms, CoFex, using MapReduce. To do so, an in-depth analysis of its limitations is conducted from the perspectives of efficiency and memory consumption when applying it to large-scale PPI data analysis and prediction. Respective solutions are then devised to overcome these limitations. In particular, we adopt a novel tree-based data structure to reduce the heavy memory consumption caused by the huge sequence information of proteins. After that, its procedure is modified to follow the MapReduce framework so that the prediction task is carried out in a distributed manner. A series of extensive experiments have been conducted to evaluate the performance of our framework in terms of both efficiency and accuracy. Experimental results demonstrate that the proposed framework can improve computational efficiency by more than two orders of magnitude while retaining the same high accuracy.
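For readers unfamiliar with the MapReduce pattern mentioned in this abstract, the single-process sketch below walks through its three phases on a toy task: counting length-k substrings (k-mers) across sequences. This is not CoFex and not the paper's framework; the task, the function names, and any sequences passed in are illustrative assumptions only.

```python
from collections import defaultdict


def map_phase(sequence, k):
    """Map: emit one (k-mer, 1) pair for every length-k window of a sequence."""
    return [(sequence[i:i + k], 1) for i in range(len(sequence) - k + 1)]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def count_kmers(sequences, k):
    """Run map, shuffle, and reduce in order, in one process."""
    pairs = [pair for seq in sequences for pair in map_phase(seq, k)]
    return reduce_phase(shuffle_phase(pairs))
```

In a real cluster the map calls run on separate workers and the shuffle moves data between machines. Since each map call touches only one sequence, the work can be partitioned freely, which is where speedups of the kind the abstract reports come from.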
Funding: Supported by the Basic Science Center Program for Multiphase Evolution in Hyper Gravity of the National Natural Science Foundation of China (No. 51988101), the National Natural Science Foundation of China (No. 52178306), and the Zhejiang Provincial Natural Science Foundation of China (No. LR19E080002).
Abstract: During construction, the shield linings of tunnels often face the problem of local or overall upward movement after leaving the shield tail in soft soil areas or during some large-diameter shield projects. Differential floating increases the initial stress on the segments and bolts, which is harmful to the service performance of the tunnel. In this study, we used a random forest (RF) algorithm combined with particle swarm optimization (PSO) and 5-fold cross-validation (5-fold CV) to predict the maximum upward displacement of tunnel linings induced by shield tunnel excavation. The mechanism and factors causing upward movement of the tunnel lining are comprehensively summarized. Twelve input variables were selected according to the results of an analysis of influencing factors. The prediction performance of two models, PSO-RF and RF (default), was compared. The Gini value was obtained to represent the relative importance of the influencing factors to the upward displacement of linings. The PSO-RF model successfully predicted the maximum upward displacement of the tunnel linings with a low error (mean absolute error (MAE) = 4.04 mm, root mean square error (RMSE) = 5.67 mm) and high correlation (R^2 = 0.915). The thrust and depth of the tunnel were the most important factors in the prediction model influencing the upward displacement of the tunnel linings.
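The three error measures quoted in this abstract, and the 5-fold partitioning used for validation, can be sketched in a few lines. The code assumes R^2 means the coefficient of determination and uses contiguous, unshuffled folds; neither choice is confirmed by the abstract, and the random forest and PSO steps themselves are not shown.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=5):
    """Yield (train, validation) index lists for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size
```

In a PSO-tuned workflow, each candidate hyperparameter set would typically be scored by averaging one of these metrics over the validation folds. RMSE weights large misses more heavily than MAE, which is why the two are usually reported together.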
Abstract: The transparent open box (TOB) learning network algorithm offers an alternative approach to the lack of transparency provided by most machine-learning algorithms. It provides the exact calculations and relationships among the underlying input variables of the datasets to which it is applied. It can also achieve credible and auditable levels of prediction accuracy for complex, non-linear datasets, typical of those encountered in the oil and gas sector, while highlighting the potential for underfitting and overfitting. The algorithm is applied here to predict bubble-point pressure from a published PVT dataset of 166 data records involving four easy-to-measure variables (reservoir temperature, gas-oil ratio, oil gravity, gas density relative to air) with uneven and, in parts, sparse data coverage. The TOB network demonstrates high prediction accuracy for this complex system, although its predictions applied to the full dataset are outperformed by an artificial neural network (ANN). However, the performance of the TOB algorithm reveals the risk of overfitting in the sparse areas of the dataset and achieves a prediction performance that matches the ANN algorithm where the underlying data population is adequate. The high levels of transparency and its inhibitions to overfitting enable the TOB learning network to provide complementary information about the underlying dataset to that provided by traditional machine learning algorithms. This makes it suitable for application in parallel with neural-network algorithms, to overcome their black-box tendencies, and for benchmarking the prediction performance of other machine learning algorithms.
Funding: Supported in part by the National Key Research and Development Program of China under Grant No. 2018YFC0831500, the Beijing Natural Science Foundation under Grant No. JQ18001, and the Beijing Academy of Artificial Intelligence.
Abstract: With the goal of predicting the future rainfall intensity in a local region over a relatively short period of time, precipitation nowcasting has been a long-standing scientific challenge with great social and economic impact. Radar echo extrapolation approaches for precipitation nowcasting take radar echo images as input, aiming to generate future radar echo images by learning from the historical images. To effectively handle the complex and highly non-stationary evolution of radar echoes, we propose to decompose the movement into optical flow field motion and morphologic deformation. Following this idea, we introduce the Flow-Deformation Network (FDNet), a neural network that models flow and deformation in two parallel cross pathways. The flow encoder captures the optical flow field motion between consecutive images, and the deformation encoder distinguishes the change of shape from the translational motion of radar echoes. We evaluate the proposed network architecture on two real-world radar echo datasets. Our model achieves state-of-the-art prediction results compared with recent approaches. To the best of our knowledge, this is the first network architecture with flow and deformation separation to model the evolution of radar echoes for precipitation nowcasting. We believe that the general idea of this work could not only inspire much more effective approaches but also be applied to other similar spatio-temporal prediction tasks.
Abstract: Epilepsy is the most common neurological disorder of the brain that affects people worldwide at any age from newborn to adult. It is characterized by recurrent seizures, which are brief episodes of signs or symptoms due to abnormal excessive or synchronous neuronal activity in the brain. The electroencephalogram, or EEG, is a physiological method to measure and record the electrical
Funding: Supported by the National Natural Science Foundation of China (22393890, You SL; 22393891 and 22031006, Luo S; 2203300, Pei J; 22371052, Chen M; 21991132, 21925102, 92056118, and 22331003, Zhang WB; 22331002 and 22125101, Lu H; 22071004, Mo F; 22393892 and 22071249, Liao K; 22122109 and 22271253, Hong X), the National Key R&D Program of China (2023YFF1205103, Pei J; 2020YFA0908100 and 2023YFF1204401, Zhang WB; 2022YFA1504301, Hong X), Zhejiang Provincial Natural Science Foundation of China (LDQ23B020002, Hong X), the Starry Night Science Fund of Zhejiang University Shanghai Institute for Advanced Study (SNZJU-SIAS-006, Hong X), the CAS Youth Interdisciplinary Team (JCTD-2021-11, Hong X), Shenzhen Medical Research Fund (B2302037, Zhang WB), Beijing National Laboratory for Molecular Sciences (BNLMSCXXM-202006, Zhang WB), the State Key Laboratory of Molecular Engineering of Polymers (Chen M), Haihe Laboratory of Sustainable Chemical Transformations and National Science & Technology Fundamental Resource Investigation Program of China (2023YFA1500008, Luo S).
Abstract: Recent years have witnessed the transformative impact of integrating artificial intelligence with organic and polymer synthesis. This synergy offers innovative and intelligent solutions to a range of classic problems in synthetic chemistry. These advancements include the prediction of molecular properties, multi-step retrosynthetic pathway planning, elucidation of the structure-performance relationship of single-step transformations, establishment of the quantitative linkage between polymer structures and their functions, design and optimization of polymerization processes, prediction of the structure and sequence of biological macromolecules, as well as automated and intelligent synthesis platforms. Chemists can now explore synthetic chemistry with unprecedented precision and efficiency, creating novel reactions, catalysts, and polymer materials under the data-driven paradigm. Despite these developments, the field of artificial intelligence (AI) synthetic chemistry is still in its infancy, facing challenges and limitations in terms of data openness, model interpretability, and software and hardware support. This review aims to provide an overview of the current progress, key challenges, and future development suggestions in the interdisciplinary field between AI and synthetic chemistry. It is hoped that this overview will offer readers a comprehensive understanding of this emerging field, inspiring and promoting further scientific research and development.
Funding: Mainly supported by the National Natural Science Foundation of China (Nos. 61125201, 61303070, and U1435219).
Abstract: Instance-specific algorithm selection technologies have been successfully used in many research fields, such as constraint satisfaction and planning. Researchers have increasingly been trying to model the potential relations between different candidate algorithms for algorithm selection. In this study, we propose an instance-specific algorithm selection method based on multi-output learning, which can manage these relations more directly. Three kinds of multi-output learning methods are used to predict the performances of the candidate algorithms: (1) multi-output regressor stacking; (2) multi-output extremely randomized trees; and (3) hybrid single-output and multi-output trees. The experimental results obtained using 11 SAT datasets and 5 MaxSAT datasets indicate that our proposed methods can obtain better performance than the state-of-the-art algorithm selection methods.
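Once a multi-output model has predicted one performance value per candidate algorithm, the selection step itself is simple. The sketch below assumes that performance means runtime, so lower is better, and shows how a stacking scheme would typically build its second-stage input by appending first-stage predictions to the instance features. The solver names, feature values, and function names are hypothetical, and no regressor is trained here.

```python
def select_algorithm(predicted_runtimes):
    """Return the name of the algorithm with the smallest predicted runtime."""
    return min(predicted_runtimes, key=predicted_runtimes.get)


def stack_features(instance_features, first_stage_predictions):
    """Second-stage input for regressor stacking: features plus first-stage outputs."""
    return list(instance_features) + list(first_stage_predictions)


# Hypothetical predictions for one problem instance (seconds)
predictions = {"solver_a": 12.5, "solver_b": 3.1, "solver_c": 40.0}
chosen = select_algorithm(predictions)
```

Feeding first-stage predictions back in as features is what lets the second stage exploit relations between candidate algorithms, for example that two solvers tend to be slow on the same instances.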