Abstract
Six atomic fragment types of organic compounds are defined, and a multilevel atom-pair frequency matrix is constructed from the number of times each pair of atomic fragments occurs at different bond lengths in the molecule. On this basis, a novel molecular coding technique, the characteristic atom-pair holographic code (CAHC), is obtained. To some extent, the method combines several advantages: it is as easy to calculate and apply as 2D molecular topological descriptors, it carries the definite physicochemical meaning of 3D molecular structural characterization methods, and it can capture complex molecular information. It is therefore suitable for quantitative structure-property/activity relationship (QSPR/QSAR) studies of drugs and biological molecules. In this paper, CAHC is applied to the quantitative prediction of reversed-phase liquid chromatography (RPLC) retention data for 33 purine derivatives and 24 steroids. For the resulting partial least-squares (PLS) regression models, the fitted multiple correlation coefficient R^2, the cross-validated multiple correlation coefficient Q^2, and the predictive ability Q^2_pred on the test-set samples are reported as 0.990, 0.893 and 0.977, 0.897, 0.941, respectively.
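The atom-pair counting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "different bond lengths" means the number of bonds on the shortest path between two atoms, and the atom-type labels are hypothetical placeholders, since the abstract does not name the six fragment types. The function names `bond_distances` and `atom_pair_counts` are likewise invented for this illustration.

```python
from collections import Counter, deque


def bond_distances(adjacency, start):
    """Breadth-first search: number of bonds from `start` to each reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def atom_pair_counts(atom_types, bonds, max_distance):
    """Count atom-type pairs at each bond separation up to `max_distance`.

    atom_types: one type label per atom (hypothetical labels, not the
        paper's six fragment types).
    bonds: list of (i, j) index pairs for bonded atoms.
    Returns a Counter keyed by (type_a, type_b, distance), with the two
    type labels sorted so that each unordered pair has a single key.
    """
    adjacency = {i: [] for i in range(len(atom_types))}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    counts = Counter()
    for i in range(len(atom_types)):
        for j, d in bond_distances(adjacency, i).items():
            # i < j visits each unordered atom pair exactly once.
            if i < j and d <= max_distance:
                type_a, type_b = sorted((atom_types[i], atom_types[j]))
                counts[(type_a, type_b, d)] += 1
    return counts


# Heavy-atom chain C-C-C-O as a toy input.
counts = atom_pair_counts(["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)], 3)
```

Arranging these counts by type pair (rows) and bond separation (columns) would give one frequency matrix per molecule; how the paper then turns that matrix into the final CAHC descriptor vector is not specified in the abstract and is not attempted here.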
Funding
This work was supported by the State Key Laboratory of Chemo/Biosensing and Chemometrics Foundation (No. 05-12-1), the Fok-Yingtung Educational Foundation (No. 98-7-6), and the Chongqing University Innovation Foundation of Science and Technology (No. 06-1-1).