Abstract
A novel method for predicting protein secondary structure is proposed. Diversity sources are built from tetra-peptide structural words, and the increment of diversity is combined with quadratic discriminant analysis (IDQD for short) to predict the secondary structure of the central residue of a sequence fragment. A correction step is then applied to the IDQD predictions. In a ten-fold cross-validation test on 21-residue fragments from a dataset of 1645 proteins, the central-residue accuracy Q3** (Q3 score) reaches 79.68%. Accuracy improves further when long-range sequence information is taken into account. Compared with existing prediction software, the method shows some advantage.
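The abstract does not define the increment of diversity. As a rough illustration, the sketch below uses the definition commonly given for this measure: D(X) = N log N − Σ n_i log n_i for a source with counts n_i and total N, and ID(X, S) = D(X + S) − D(X) − D(S). Whether the paper uses exactly this form, and how it defines a tetra-peptide structural word, are assumptions here. The helper `word_counts` is hypothetical and simply counts overlapping 4-residue windows as a stand-in.

```python
import math
from collections import Counter


def diversity(counts):
    """D(X) = N log N - sum_i n_i log n_i, with 0 log 0 taken as 0."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(
        n * math.log(n) for n in counts.values() if n > 0
    )


def increment_of_diversity(x, s):
    """ID(X, S) = D(X + S) - D(X) - D(S); smaller means more similar."""
    merged = Counter(x)
    merged.update(s)  # element-wise sum of the two count vectors
    return diversity(merged) - diversity(x) - diversity(s)


def word_counts(seq, k=4):
    """Hypothetical stand-in: count overlapping k-residue windows."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    # Toy sequences, not data from the paper.
    fragment = word_counts("ACDEFGHIKLACDEFGHIKLM")
    source = word_counts("ACDEFGHIKLMNPQRSTVWY")
    print(increment_of_diversity(fragment, source))
```

In a scheme of this kind, a fragment would be compared against one standard source per structural class, and the resulting ID values would serve as input features to the quadratic discriminant step.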
Source
《内蒙古大学学报(自然科学版)》
CAS
CSCD
PKU Core Journals
2008, Issue 3, pp. 300-306 (7 pages)
Journal of Inner Mongolia University: Natural Science Edition
Funding
Supported by the National Natural Science Foundation of China (90403010)
Keywords
protein secondary structure
tetra-peptide structural word
increment of diversity
quadratic discriminant analysis
structure fluctuation
boundary correction
long-range interaction