Virtual screening of compound databases is a promising approach to identify inhibitors of DNA methyltransferases (DNMTs) and other epigenetic targets. An important first step before conducting virtual screening is to characterize the structural diversity and chemical space coverage of the screening collections. Herein, we report a comprehensive chemoinformatic characterization of novel screening libraries, including a focused collection directed to inhibitors of DNMTs and two natural product databases. The compound databases were assessed in terms of physicochemical properties, molecular scaffolds, and fingerprints. As part of the scaffold diversity analysis, a recently developed method based on Shannon entropy was used. The overall approach enabled the analysis of property space coverage, the degree of overlap between collections, and scaffold and structural diversity. The distribution of physicochemical properties indicates that the DNMT focused library and the two natural product collections contain molecules with properties similar to those of approved drugs. Moreover, the natural product databases analyzed in this work have chemical structures different from those of approved drugs and synthetic databases and are therefore attractive for virtual screening for DNMT inhibitors. The scaffold analysis revealed that the focused library has, overall, the largest scaffold diversity and that its most frequent scaffolds are not found in the other analyzed collections. Therefore, the focused library is also attractive for virtual and experimental screening for novel inhibitors. This study represents a first step towards the virtual screening of novel compound databases to identify inhibitors of DNMTs. The results of this study are general and can be used for the virtual screening of the compound databases against targets relevant to other therapeutic applications.
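The abstract mentions a Shannon entropy based measure of scaffold diversity without giving its form. The sketch below illustrates one common formulation, assumed here rather than taken from the text: the entropy of the scaffold frequency distribution, divided by its maximum possible value so that the result lies between 0 (all compounds share one scaffold) and 1 (compounds spread evenly over the scaffolds). The function name `scaled_shannon_entropy`, the `top_n` parameter, and the scaffold labels are hypothetical; in practice the labels would come from a scaffold-extraction step that is not shown.

```python
import math
from collections import Counter


def scaled_shannon_entropy(scaffolds, top_n=None):
    """Scaled Shannon entropy of a scaffold distribution (illustrative sketch).

    scaffolds: iterable of hashable scaffold identifiers, one per compound
               (hypothetical labels, e.g. canonical scaffold strings).
    top_n:     if given, only the top_n most populated scaffolds are used
               (an assumption of this sketch, not stated in the abstract).
    """
    counts = Counter(scaffolds)
    # Keep the most populated scaffolds; top_n=None keeps all of them.
    freqs = [count for _, count in counts.most_common(top_n)]
    n = len(freqs)
    if n < 2:
        # Zero or one distinct scaffold: no diversity to measure.
        return 0.0
    total = sum(freqs)
    # Shannon entropy in bits over the relative scaffold frequencies.
    entropy = -sum((c / total) * math.log2(c / total) for c in freqs)
    # Divide by the maximum entropy for n scaffolds, log2(n).
    return entropy / math.log2(n)


if __name__ == "__main__":
    even = ["s1", "s2", "s3", "s4"]            # one compound per scaffold
    skewed = ["s1"] * 8 + ["s2", "s3"]         # one dominant scaffold
    print(scaled_shannon_entropy(even))
    print(scaled_shannon_entropy(skewed))
```

Under this formulation, a library whose compounds are spread evenly across scaffolds scores higher than one dominated by a single scaffold, which matches the sense in which the abstract compares scaffold diversity between collections.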