Generating molecules with desired properties is an important task in chemistry and pharmacy. An efficient method may have a positive impact on finding drugs to treat diseases like COVID-19. Data mining and artificial intelligence are promising routes to such a method. Recently, both generative models based on deep learning and approaches based on genetic algorithms have made progress in generating molecules and optimizing their properties. However, existing methods have defects in their experimental evaluation standards, and their efficiency and performance still need improvement. To solve these problems, we propose a method named the Chemical Genetic Algorithm for Large Molecular Space (CALM). Specifically, CALM employs a scalable and efficient molecular representation called the molecular matrix, and we design corresponding crossover, mutation, and mask operators inspired by domain knowledge and previous studies. We apply our genetic algorithm to several tasks related to molecular property optimization and constrained molecular optimization. The results show that our approach outperforms other state-of-the-art deep learning and genetic algorithm methods; z tests performed on the results of several experiments indicate that the improvement is statistically significant at a confidence level above 99%. At the same time, based on the experimental results, we point out defects in the experimental evaluation standard that affect the fair evaluation of all previous work. Avoiding these defects helps to evaluate the performance of all work objectively.
Funding: the National Key Research and Development Program of China under Grant No. 2016YFB1000904, the National Natural Science Foundation of China under Grant Nos. 61922073 and U20A20229, and the Youth Innovation Promotion Association of Chinese Academy of Sciences under Grant No. 2014299.
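The abstract reports z tests on the results of several experiments but does not say which form of the test was used. As a rough illustration only, the sketch below assumes a one-sided two-sample z test comparing the mean property score of one method against another over repeated runs. The function name, the sample values, and the choice of a one-sided test are all assumptions for illustration, not the authors' procedure or data.

```python
import math
from statistics import NormalDist, mean, stdev


def two_sample_z_test(a, b):
    """One-sided two-sample z test of H1: mean(a) > mean(b).

    Assumption: sample standard deviations stand in for the population
    values, which is only reasonable for sufficiently large samples.
    Returns the z statistic and the one-sided p-value.
    """
    # Standard error of the difference between the two sample means.
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = (mean(a) - mean(b)) / se
    # Upper-tail probability under the standard normal distribution.
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical scores from ten repeated runs of each method.
# These numbers are placeholders, not results from the paper.
ours = [5.2, 5.4, 5.1, 5.3, 5.5, 5.2, 5.4, 5.3, 5.1, 5.5]
baseline = [4.9, 5.0, 4.8, 5.1, 4.7, 5.0, 4.9, 4.8, 5.1, 4.7]

z, p = two_sample_z_test(ours, baseline)
print(f"z = {z:.2f}, one-sided p = {p:.3g}")
# A p-value below 0.01 corresponds to significance at the 99% level.
print("significant at 99%:", p < 0.01)
```

With only a handful of runs per method, a t test would usually be the more cautious choice; the z form is shown here only because it is the test the abstract names.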