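Abstract: Aldose reductase is implicated in the development of diabetic complications, and common antidiabetic drugs often target this enzyme. These drugs, however, cause severe toxic side effects in use, so the search for safe aldose reductase inhibitors is currently a focus of functional food and pharmaceutical research. Plant-derived flavonoids show strong aldose reductase inhibitory activity, but their mechanism of action remains unclear. This study therefore uses molecular modeling to investigate the three-dimensional quantitative structure-activity relationship (3D-QSAR) and the binding mode of flavonoids as aldose reductase inhibitors. A 3D-QSAR model for the inhibition of aldose reductase by 39 flavonoid molecules was built with comparative molecular field analysis based on R-group search (Topomer CoMFA), and its external predictive ability was validated on a test set of 12 samples. The correlation coefficients of the model for fitting, cross-validation, and external validation were 0.831, 0.564, and 0.794, respectively. On this basis, Surflex-dock molecular docking was used to study the binding mode between flavonoids and aldose reductase. The results indicate that different flavonoid configurations lead to different orientations in the hydrophobic cavity of the enzyme, which in turn cause differences in activity. When the distribution of substituents on the flavonoid follows the modification rules indicated by the steric and electrostatic fields, binding to the enzyme improves markedly and aldose reductase inhibition increases. These findings offer some guidance for developing new aldose reductase inhibitors and for promoting the use of flavonoids in functional foods.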
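The abstract reports three coefficients (fitting, cross-validation, external validation) without giving their formulas. The sketch below assumes the usual 3D-QSAR conventions, r² = 1 - SS_res/SS_tot and a leave-one-out q² = 1 - PRESS/SS_tot, and uses a one-descriptor least-squares line as a stand-in for the Topomer CoMFA model. The function names and the toy data are hypothetical and are not taken from the paper.

```python
def r_squared(observed, predicted):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of `observed`."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(xs, ys):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds


# Hypothetical toy data: one descriptor value and one activity per compound.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

slope, intercept = fit_line(descriptor, activity)
fitted = [slope * x + intercept for x in descriptor]
r2 = r_squared(activity, fitted)                                  # fitting
q2 = r_squared(activity, loo_predictions(descriptor, activity))   # leave-one-out
```

For a least-squares model the leave-one-out residuals are never smaller in magnitude than the fitted residuals, so q² cannot exceed r². This matches the ordering of the reported fitting and cross-validation values.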
Funding: This work was supported by the Fok-Yingtung Educational Foundation (FYEF) (Grant No. 98-7-6), the National Chun-hui Project Foundation (NCPF) (Grant No. 99-04+99-37), and the Chongqing Applied Fundamental Science Fund (CAFS) (Grant No. 01-3-6).
Abstract: Support vector machine (SVM), partial least squares (PLS), and back-propagation (BP) artificial neural network (ANN) were employed to establish QSAR models for two dipeptide datasets. To assess how well the resulting models predict an external dataset, both internal and external validations were performed. Each dataset was divided into training and test sets by D-optimal design. The results showed that SVM performed well in both calibration and prediction. For the dataset of 48 bitter-tasting dipeptides (BTD), the results obtained by support vector regression (SVR) were superior to those by PLS in both calibration and prediction. Compared with the BP artificial neural network, SVR showed less calibration power but more predictive capability. For the dataset of angiotensin-converting enzyme (ACE) inhibitors, the results obtained by SVR were equivalent to those by PLS and the BP artificial neural network. In both datasets, SVR using a linear kernel function performed as well as SVR using a radial basis kernel function. The results suggest that SVM has wide prospects for application in QSAR modeling.
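As a reminder of what distinguishes the two kernels compared in this abstract, the sketch below writes them out in their standard textbook form. The `gamma` default is an arbitrary illustrative value; the abstract does not report the kernel parameters used.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the plain dot product of two descriptor vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

The linear kernel grows with the magnitude of the descriptors, while the radial basis kernel depends only on the distance between two samples and always lies in (0, 1].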
Funding: Supported by the National Natural Science Foundation of China (10901169); the National 111 Programme of Introducing Talents of Discipline to Universities (0507111106); the Innovation Ability Training Foundation of Chongqing University (CDCX008); and the Innovative Group Program for Graduates of Chongqing University, Science and Innovation Fund (200711C1A0010260).
Abstract: An integrated approach is proposed to predict the chromatographic retention time of oligonucleotides based on quantitative structure-retention relationship (QSRR) models. First, the primary base sequences of oligonucleotides are translated into vectors based on scores of generalized base properties (SGBP), which cover physicochemical, quantum chemical, topological, and spatial structural properties, among others. The sequence data are then transformed into a uniform matrix by auto cross covariance (ACC). ACC accounts for the interactions between bases a certain distance apart in an oligonucleotide sequence, so the method takes the neighboring effect into account. Next, a genetic algorithm is used to select the variables related to the chromatographic retention behavior of oligonucleotides. Finally, a support vector machine is used to develop QSRR models that predict chromatographic retention behavior. The whole dataset is divided into pairs of training and test sets in different proportions. QSRR models built on more than 26 training samples were found to have adequate external predictive power and to represent accurately the relationship between sequence and structural features and retention times. The results indicate that the SGBP-ACC approach is a useful structural representation method for QSRR of oligonucleotides, with advantages such as rich structural information, easy manipulation, and high characterization competence. The method can further be applied to predict the chromatographic retention behavior of oligonucleotides.
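The ACC step is what turns sequences of different lengths into descriptor vectors of equal length. A minimal sketch follows, assuming the common form in which each term is the sum over positions of score j at position i times score k at position i + lag, divided by (n - lag), with no mean-centering. The two-component `BASE_SCORES` table is a hypothetical placeholder and not the SGBP scales from the paper.

```python
# Hypothetical two-component base scores for illustration only (not SGBP).
BASE_SCORES = {
    "A": (1.0, 0.0),
    "C": (0.0, 1.0),
    "G": (-1.0, 0.0),
    "T": (0.0, -1.0),
}


def encode(sequence):
    """Translate a base sequence into a list of per-base score vectors."""
    return [BASE_SCORES[base] for base in sequence]


def acc_transform(seq_scores, max_lag):
    """Auto cross covariance terms for lags 1..max_lag.

    Returns max_lag * d * d values, where d is the number of scores per
    base, so the output length does not depend on the sequence length.
    """
    n = len(seq_scores)
    d = len(seq_scores[0])
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(d):
            for k in range(d):
                total = sum(
                    seq_scores[i][j] * seq_scores[i + lag][k]
                    for i in range(n - lag)
                )
                features.append(total / (n - lag))
    return features


short = acc_transform(encode("ACGT"), max_lag=2)
longer = acc_transform(encode("ACGTACG"), max_lag=2)
```

Both `short` and `longer` contain 2 × 2 × 2 = 8 values, which is the uniform-matrix property the abstract refers to. Terms with j = k are auto covariances and terms with j ≠ k are cross covariances.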
Funding: Supported by the National High-tech R&D Program (the 863 Program) (Grant No. 2006AA02Z312) and the Innovative Group Program for Graduates of Chongqing University, Science and Innovation Fund (Grant No. 200711C1A0010260).
Abstract: A new set of descriptors, namely score vectors of the zero dimension, one dimension, two dimensions and three dimensions (SZOTT), was derived from principal component analysis of a matrix of 1369 structural variables including 0D, 1D, 2D and 3D information for the 20 coded amino acids. SZOTT scales were then used in cleavage site prediction for human immunodeficiency virus type 1 protease. Linear discriminant analysis (LDA) and support vector machines (SVM) were applied to develop models to predict the cleavage sites. For LDA and SVM respectively, the Matthews correlation coefficients (MCC) are 0.879 and 0.911 by the resubstitution test, 0.849 and 0.901 by leave-one-out cross validation (LOOCV), and 0.822 and 0.846 by external validation. Receiver operating characteristic (ROC) analysis showed that the SVM model has better fitting and predictive ability than the LDA model. These satisfactory results show that SZOTT descriptors can be further used to predict cleavage sites of human immunodeficiency virus type 1 protease.
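For readers unfamiliar with the metric quoted here, the sketch below computes the Matthews correlation coefficient from the four cells of a binary confusion matrix, using its standard definition. Returning 0.0 when a marginal total is zero is a common convention assumed here, and the counts in the example are made up for illustration rather than taken from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: undefined MCC (an empty row or column) is reported as 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts: 45 cleaved sites found, 5 missed, 10 false alarms.
example = matthews_corrcoef(tp=45, tn=40, fp=10, fn=5)
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction right), and unlike plain accuracy it stays informative when cleaved and non-cleaved sites are unevenly represented.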
Funding: Foundations of the National High Technology (863) Programme (Grant No. 2006AA02Z312); Innovative Group Programme for Graduates of Chongqing University, Science and Innovation Fund (Grant No. 200711C1A0010260); National 111 Programme Introducing Talents of Discipline to Universities (Grant No. 0507111106); Chongqing Municipality Basic and Applied Fundamental Science Fund (Grant No. 01-3-6); National Chunhui Project Foundation (Grant No. 99-4-4+3-7); State Key Laboratory of Chemo/Biosensing and Chemometrics Fund (Grant No. 2005012); Fok-Yingtung Educational Foundation (Grant No. 98-7-6).
Abstract: A total of 200 properties related to structural characteristics were used to represent the structures of 400 HA-coded proteins of influenza virus, which served as training samples. Recognition models for HA proteins of avian influenza virus (AIV) were developed using support vector machine (SVM) and linear discriminant analysis (LDA). With LDA, the identification accuracy (Ria) is 99.8% for the training samples and 99.5% by leave-one-out cross validation. With SVM, Ria is 99.8% for the training samples and 99.3% by leave-one-out cross validation. A further 200 HA proteins of influenza virus were used to validate the external predictive power of the resulting models. The external Ria is 95.5% by LDA and 96.5% by SVM, which shows that HA proteins of AIVs are well recognized by both SVM and LDA, and that the performance of SVM is superior to that of LDA.