Abstract
Recognizing secondary-structure segments in proteins amounts to predicting secondary structure at the level of sequence segments. Using the measure-of-diversity method, a series of calculations shows the following. The parameters are the amino acid occurrences (the 20 amino acids plus one gap symbol) together with their nearest-neighbor (adjacent-residue) correlations. When the fixed-length segment taken from the N-terminus contains 6 or 7 consecutive residues, the average prediction accuracy reaches 75%-77% under 10-fold cross-validation and 72% under the jackknife test. When the segment contains 9 or 10 consecutive residues, the average prediction accuracy reaches 82%-83% under 10-fold cross-validation and 74%-76% under the jackknife test.
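The abstract does not spell out the computation, so the following is only a minimal sketch of how a measure-of-diversity classifier over such parameters is commonly set up. It assumes the usual definitions D(X) = N ln N - Σ n_i ln n_i and ID(X, Y) = D(X + Y) - D(X) - D(Y), and assigns a segment to the class whose pooled counts give the smallest increment of diversity. The gap symbol `-`, the function names, and the toy segments in the usage note are all hypothetical, not taken from the paper.

```python
import math
from collections import Counter

# Assumed encoding: 20 amino acids plus "-" as the gap symbol.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"

def segment_counts(segment):
    """Count single residues and adjacent residue pairs in one segment."""
    counts = Counter(segment)  # single-symbol occurrences
    counts.update(a + b for a, b in zip(segment, segment[1:]))  # neighbor pairs
    return counts

def diversity(counts):
    """Measure of diversity: D(X) = N ln N - sum_i n_i ln n_i."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(
        n * math.log(n) for n in counts.values() if n > 0
    )

def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); smaller means more similar."""
    return diversity(x + y) - diversity(x) - diversity(y)

def build_source(segments, length):
    """Pool the counts of the first `length` residues of each training segment."""
    pooled = Counter()
    for seg in segments:
        pooled += segment_counts(seg[:length])
    return pooled

def predict(segment, sources, length):
    """Return the class label whose pooled counts are closest to the segment."""
    query = segment_counts(segment[:length])
    return min(sources, key=lambda label: increment_of_diversity(query, sources[label]))
```

A toy call, with made-up segments standing in for real training data, would look like `predict("AEELLKA", {"H": build_source(["AEELLKK", "EELLKKA"], 7), "E": build_source(["VIVTVYV", "TVIVYVT"], 7)}, 7)`. Changing `length` between 6, 7, 9, and 10 corresponds to the segment lengths compared in the abstract.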
Source
Journal of Inner Mongolia University of Technology (Natural Science Edition)
2007, No. 4, pp. 246-251 (6 pages)
Funding
National Natural Science Foundation of China (grant 30560039)
Natural Science Foundation of Inner Mongolia (grant 200508010509)
Keywords
measure of diversity
protein secondary structure
sequence segment