Funding: the Foundations of National High Technology (863) Programme (Grant No. 2006AA02Z312); State New Drug Project (Grant No. 1996ND1035A01); Fok-Yingtung Educational Foundation (Grant No. 980706); State Key Laboratory of Chemo/Biosensing and Chemometrics Foundation (Grant No. KLCB005-0012); Chongqing University Innovation Fund (Grant No. CUIF030506); Chongqing Municipality Applied Science Fund (Grant No. CASF01-3-6); Momentous Juche Innovation Fund for Tackle Key Problem Items (Grant No. MJIF 06-9-9)
Abstract: A new descriptor, the vector of topological and structural information for coded and noncoded amino acids (VTSA), was derived by principal component analysis (PCA) from a matrix of 66 topological and structural variables of 134 amino acids. The VTSA vector was then applied to two sets of peptide quantitative structure-activity relationships or quantitative sequence-activity models (QSARs/QSAMs). Models built with genetic partial least squares (GPLS), support vector machine (SVM), and immune neural network (INN) gave good results. For the datasets of 58 angiotensin-converting enzyme inhibitors (ACEI) and 89 elastase substrate catalyzed kinetics (ESCK), the R², cross-validated R² (Q²), and root mean square error of estimation (RMSEE) were as follows: ACEI, R²_cu ≥ 0.82, Q²_cu ≥ 0.77, E_rmse ≤ 0.44 (GPLS+SVM); ESCK, R²_cu ≥ 0.84, Q²_cu ≥ 0.82, E_rmse ≤ 0.20 (GPLS+INN).
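The three statistics quoted above can be illustrated with a minimal sketch. It assumes a one-variable ordinary least squares fit as a stand-in for the GPLS, SVM, and INN models of the abstract, which are not reproduced here. The function names and the toy data are hypothetical, and the error is computed as a plain root mean square of the residuals; the RMSEE reported in the paper may use a degrees-of-freedom correction.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def fit_statistics(xs, ys):
    """Return (R2, leave-one-out Q2, RMSE of the fit) for a straight-line model."""
    xs, ys = list(xs), list(ys)
    n = len(ys)
    mean_y = sum(ys) / n
    tss = sum((y - mean_y) ** 2 for y in ys)  # total sum of squares

    # Residual sum of squares of the model fitted on all samples
    a, b = fit_line(xs, ys)
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

    # PRESS: each sample is predicted by a model fitted without it
    press = 0.0
    for i in range(n):
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a_i + b_i * xs[i])) ** 2

    r2 = 1.0 - rss / tss
    q2 = 1.0 - press / tss
    rmse = math.sqrt(rss / n)  # assumption: no degrees-of-freedom correction
    return r2, q2, rmse

# Hypothetical toy data, not taken from the paper
r2, q2, rmse = fit_statistics([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.1])
print(f"R2={r2:.3f}  Q2={q2:.3f}  RMSE={rmse:.3f}")
```

For a least squares fit, Q² computed this way never exceeds R², so a large gap between the two is a common sign of overfitting.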
Funding: Supported by the Research on National High-tech R&D Program (the 863 program) (Grant No. 2006AA02Z312) and the Innovative Group Program for Graduates of Chongqing University, Science and Innovation Fund (Grant No. 200711C1A0010260)
Abstract: A new set of descriptors, namely score vectors of the zero dimension, one dimension, two dimensions and three dimensions (SZOTT), was derived from principal component analysis of a matrix of 1369 structural variables including 0D, 1D, 2D and 3D information for the 20 coded amino acids. SZOTT scales were then used in cleavage site prediction of human immunodeficiency virus type 1 protease. Linear discriminant analysis (LDA) and support vector machines (SVM) were applied to develop models that predict the cleavage sites. For LDA and SVM respectively, the Matthews correlation coefficients (MCC) are 0.879 and 0.911 by the resubstitution test, 0.849 and 0.901 by leave-one-out cross validation (LOOCV), and 0.822 and 0.846 by external validation. The receiver operating characteristic (ROC) analysis showed that the SVM model has better fitting and predictive ability than the LDA model. These satisfactory results show that SZOTT descriptors can be further used to predict cleavage sites of human immunodeficiency virus type 1 protease.
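The Matthews correlation coefficient reported above is computed from the four cells of a binary confusion matrix. The following is a minimal sketch of the standard formula; the function name and the example counts are hypothetical and are not taken from the paper.

```python
import math

def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional value when a whole row or column of the table is empty
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts: 45 true positives, 45 true negatives, 5 errors of each kind
print(round(matthews_corrcoef(tp=45, tn=45, fp=5, fn=5), 3))  # prints 0.8
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and unlike plain accuracy it stays informative when cleavable and non-cleavable sites are unevenly represented.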
Funding: the National Chunhui Project Foundation (Grant No. 99-4-4+37); the Fok-Yingtung Educational Foundation (Grant No. 98-7-6); the State New Drug Project (Grant No. 1996ND1035A01); the Chongqing Municipality Applied Fundamental Science Fund (Grant No. 01-3-6); Juche Academic Innovation Foundation for Science and Technology (Grant No. 03ZY12XT06); Chongqing University Subject Constructive Fund (Grant No. 04-10-10); State Key Laboratory for Chemobiosensors and Chemometrics under MOST Fund (Grant No. 2005012); National High Technology Research and Development (863) Program of China (Grant No. 2006AA02Z312)
Abstract: Both the concept and the model of snug quantitative structure-activity relationship (QSAR) were proposed and developed for molecular design by constructing QSAR based on a known mode of receptor/ligand interactions. The proposed method avoids many disadvantages of traditional models, which depend only on the molecular structural features of the sample sets themselves. Using this idea, a genetic virtual screening of peptide/protein combinations (GVSPPC) is proposed for the first time to examine peptide/protein affinity activities. A genetic algorithm (GA) was developed for screening combinative targets with an interaction mode for virtual receptors. GVSPPC overcomes difficulties in rational QSAR by searching for ligand/receptor interactions when the structures are unknown. Bioactive oligo-/poly-peptide systems covering 58 angiotensin-converting enzyme (ACE) inhibitors and 18 double-site mutation residues in camel antibody protein cAb-Lys3 were investigated by GVSPPC with satisfactory results (R²_cu > 0.91, Q²_cv > 0.86, E_RMS = 0.19-0.95), which demonstrates that GVSPPC is more interpretable in terms of the ligand-receptor interaction than the traditional QSAR method.
Funding: Foundations of National High Technology (863) Programme (Grant No. 2006AA02Z312); Innovative Group Programme for Graduates of Chongqing University, Science and Innovation Fund (Grant No. 200711C1A0010260); National 111 Programme Introducing Talents of Discipline to Universities (Grant No. 0507111106); Chongqing Municipality Basic and Applied Fundamental Science Fund (Grant No. 01-3-6); National Chunhui Project Foundation (Grant No. 99-4-4+3-7); State Key Laboratory of Chemo/Biosensing and Chemometrics Fund (Grant No. 2005012); Fok-Yingtung Educational Foundation (Grant No. 98-7-6)
Abstract: A total of 200 properties related to structural characteristics were employed to represent the structures of 400 HA-coded proteins of influenza virus as training samples. Recognition models for HA proteins of avian influenza virus (AIV) were developed using support vector machine (SVM) and linear discriminant analysis (LDA). With LDA, the identification accuracy (Ria) is 99.8% for the training samples and 99.5% by leave-one-out cross validation. With the SVM model, Ria is 99.8% for the training samples and 99.3% by leave-one-out cross validation. An external set of 200 HA proteins of influenza virus was used to validate the external predictive power of the resulting models. The external Ria is 95.5% by LDA and 96.5% by SVM, which shows that HA proteins of AIVs are well recognized by both SVM and LDA, and that the performance of SVM is superior to that of LDA.
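The leave-one-out identification accuracy described above can be sketched as follows. This is an illustration under stated assumptions: a nearest-centroid classifier is used as a simple stand-in for the LDA and SVM models of the abstract, and the function names, class labels, and two-feature toy samples are all hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class mean is closest to x (squared Euclidean distance)."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1

    def distance(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=distance)

def loocv_accuracy(xs, ys):
    """Fraction of samples classified correctly when each is held out in turn."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            hits += 1
    return hits / len(xs)

# Hypothetical toy data: two well-separated classes with two features each
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
labels = ["avian", "avian", "avian", "other", "other", "other"]
print(loocv_accuracy(samples, labels))  # prints 1.0
```

Because every held-out sample is predicted by a model that never saw it, this accuracy is a less optimistic estimate than the accuracy on the training samples, which matches the ordering of the values reported in the abstract.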