Abstract
This study developed a protein classification method based on molecular mutual recognition, combining data mining strategies with statistical clustering. The method was applied to coenzyme-A (CoA) binding proteins, using binding-pattern features extracted with the Pocket1.0 program. Several classification methods were compared for their accuracy on this system, and the two-step clustering method performed best. With this approach, the known CoA binding proteins were clustered into three groups. A classification coefficient was designed to evaluate, simply and effectively, the significance and importance of each binding feature, and it was used to select the decisive feature variables from the full feature set. Hydrogen bonds and hydrophobic interactions are prominent in all three clusters, and CoA forms hydrogen bonds with a number of residues that are critical to biological activity. The shared features of these interactions and the differences among the three clusters reveal subtle variations in how the ligand binds to different receptors, which provides a useful reference for designing selective modulators, such as specific agonists and antagonists, that target CoA binding proteins.
Source
《物理化学学报》
SCIE
CAS
CSCD
PKU Core Journal
2011, Issue 5, pp. 1223-1231 (9 pages)
Acta Physico-Chimica Sinica
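Funding
National Science and Technology Major Project for Major New Drug Innovation and Development (2009ZX09501-002)
National Natural Science Foundation of China (20802006)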