Abstract
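The distribution patterns of hexads (six-element segments) in the secondary structure main sequences of α, β, α/β, and multi-domain proteins are given. A method for recognizing (classifying) the structural class of a protein from its secondary structure main sequence is proposed. Using the triplets of the secondary structure main sequence as parameters, the Mahalanobis distance method is applied to recognize proteins of these four structural classes, with an overall classification accuracy of 81%. Taking the counts of hexads in the secondary structure main sequence as the source of diversity for protein structure, the four structural classes are recognized by minimizing the increment of diversity, with an overall classification accuracy of 83%.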
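The Mahalanobis distance step can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the paper: the three-letter alphabet `HEC` (helix, strand, coil), the toy training strings, the class names, and the small `ridge` term added to the covariance diagonal. Some regularization is needed because triplet frequencies sum to 1, which makes the sample covariance singular; how the authors handled this is not stated in the abstract.

```python
from itertools import product

ALPHABET = "HEC"  # assumed alphabet: helix, strand, coil
TRIPLETS = ["".join(p) for p in product(ALPHABET, repeat=3)]

def triplet_freqs(ss):
    """Relative frequencies of overlapping triplets in a secondary structure string."""
    counts = dict.fromkeys(TRIPLETS, 0)
    for i in range(len(ss) - 2):
        counts[ss[i:i + 3]] += 1
    total = max(len(ss) - 2, 1)
    return [counts[t] / total for t in TRIPLETS]

def mean_and_cov(vectors, ridge=1e-3):
    """Class mean and sample covariance; ridge on the diagonal is an assumption."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    cov = [[sum((v[a] - mean[a]) * (v[b] - mean[b]) for v in vectors) / max(n - 1, 1)
            for b in range(d)] for a in range(d)]
    for j in range(d):
        cov[j][j] += ridge  # keeps the matrix invertible
    return mean, cov

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T cov^{-1} (x - mean)."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    return sum(d * y for d, y in zip(diff, solve(cov, diff)))

def classify(ss, models):
    """Assign ss to the class whose mean is nearest in Mahalanobis distance."""
    x = triplet_freqs(ss)
    return min(models, key=lambda name: mahalanobis_sq(x, *models[name]))

# Hypothetical toy training data, two classes only.
training = {
    "alpha": ["HHHHCCHHHHHCCHHHH", "CHHHHHHCCCHHHHHC", "HHHCCHHHHHHHCHHH"],
    "beta": ["EEEECCEEEEECCEEEE", "CEEEEEECCCEEEEEC", "EEECCEEEEEEECEEE"],
}
models = {name: mean_and_cov([triplet_freqs(s) for s in seqs])
          for name, seqs in training.items()}
print(classify("HHHHHCCCHHHHHH", models))
```

A real application would fit one mean and covariance per structural class on a large training set; with only a few training strings, the ridge term dominates the covariance estimate.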
The distribution of hexads (six-element segments) in the secondary structure sequences of different classes of proteins is reported. Based on this, two methods for recognizing the structural class of a protein are proposed. The first uses the Mahalanobis distance computed from the frequencies of triplets in the secondary structure sequence. The second uses a diversity measure based on the frequencies of hexads, which are regarded as the source of diversity. Predictions were made on a set of 1,130 proteins from four classes, namely α, β, α/β, and multi-domain proteins. The success rates of the two methods are about 81% and 83%, respectively. The method introduced here also offers an approach to predicting the compact structural domains of proteins.
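The diversity-based method can be sketched as follows, assuming the commonly used definition of the diversity measure, D = N log N - Σ nᵢ log nᵢ for counts nᵢ with total N, and the increment Δ(X, Y) = D(X + Y) - D(X) - D(Y). The abstract does not spell out these formulas, so treat them as an assumption. The `HEC` strings, class names, and function names are hypothetical. A query is assigned to the class whose pooled hexad counts give the smallest increment.

```python
from collections import Counter
from math import log

def kmer_counts(ss, k=6):
    """Counts of overlapping k-element segments (hexads for k=6)."""
    return Counter(ss[i:i + k] for i in range(len(ss) - k + 1))

def diversity(counts):
    """D = N log N - sum_i n_i log n_i for absolute counts n_i with total N."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return n * log(n) - sum(c * log(c) for c in counts.values() if c > 0)

def diversity_increment(x, y):
    """Delta(X, Y) = D(X + Y) - D(X) - D(Y); zero when proportions are identical."""
    return diversity(x + y) - diversity(x) - diversity(y)

def classify(ss, class_sources, k=6):
    """Assign ss to the class with the minimal increment of diversity."""
    query = kmer_counts(ss, k)
    return min(class_sources,
               key=lambda name: diversity_increment(query, class_sources[name]))

# Hypothetical toy training data; each class source pools the hexad counts
# of all training sequences in that class.
training = {
    "alpha": ["HHHHCCHHHHHCCHHHH", "CHHHHHHCCCHHHHHC", "HHHCCHHHHHHHCHHH"],
    "beta": ["EEEECCEEEEECCEEEE", "CEEEEEECCCEEEEEC", "EEECCEEEEEEECEEE"],
}
class_sources = {name: sum((kmer_counts(s) for s in seqs), Counter())
                 for name, seqs in training.items()}
print(classify("HHHHHHCCHHHHHHH", class_sources))
```

The increment is non-negative and largest when the query and the class share no hexads, which is why minimizing it selects the class with the most similar hexad composition.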
Source
《生物化学与生物物理进展》
SCIE
CAS
CSCD
Peking University Core Journals (北大核心)
2002, Issue 6, pp. 938-941 (4 pages)
Progress in Biochemistry and Biophysics
Funding
Supported by the National Natural Science Foundation of China (3 9960 0 2 390 10 3 0 3 0)