Abstract
An alignment-free method for comparing protein three-dimensional structures is proposed, based on the LZ complexity distance. The method takes the conditional LZ complexity between protein structural units as its feature parameter and computes from it the LZ complexity distance, which quantifies the (dis)similarity of protein three-dimensional structures. The computation completes in quadratic polynomial time. Protein structures are represented as contact maps, which avoids the influence of non-structural information in PDB-format data and of differing coordinate systems on the comparison. Using five data sets composed of real protein three-dimensional structures, the protein single chains in each data set were clustered by LZ complexity distance. The clustering results agree with the classification of these chains in conventional structure classification databases, indicating that the proposed method can effectively compare protein three-dimensional structures quantitatively.
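The abstract does not give the formulas, so the following is only a minimal sketch of how such a pipeline might look, not the authors' implementation. It assumes (a) a Lempel-Ziv (1976) exhaustive-history phrase count as the LZ complexity, (b) the conditional complexity c(Q|S) = c(SQ) - c(S) and the normalized distance max{c(Q|S), c(S|Q)} / max{c(S), c(Q)}, a commonly used LZ-based sequence distance, and (c) a contact map built from residue coordinates with a hypothetical 8 Å cutoff and serialized row by row over its upper triangle. The function names, the cutoff, and the serialization are all illustrative assumptions.

```python
import math


def lz_complexity(s: str) -> int:
    """Count phrases in an LZ76-style parsing of s (naive, polynomial time)."""
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Grow the phrase while it can be copied from the text that
        # precedes the phrase's own last character.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


def lz_distance(s: str, q: str) -> float:
    """Assumed normalized LZ distance built from conditional complexities."""
    cs, cq = lz_complexity(s), lz_complexity(q)
    denom = max(cs, cq)
    if denom == 0:  # both strings empty
        return 0.0
    cond_q_given_s = lz_complexity(s + q) - cs  # c(Q|S), assumed definition
    cond_s_given_q = lz_complexity(q + s) - cq  # c(S|Q), assumed definition
    return max(cond_q_given_s, cond_s_given_q) / denom


def contact_map(coords, cutoff=8.0):
    """Binary contact map: 1 if two residues lie within `cutoff` (assumed, in Å)."""
    n = len(coords)
    return [[1 if math.dist(coords[i], coords[j]) <= cutoff else 0
             for j in range(n)] for i in range(n)]


def map_to_string(cmap) -> str:
    """Serialize the strict upper triangle row by row (an assumed encoding)."""
    n = len(cmap)
    return "".join(str(cmap[i][j]) for i in range(n) for j in range(i + 1, n))


if __name__ == "__main__":
    # Toy coordinates standing in for residue positions of two chains.
    chain_a = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (20, 0, 0)]
    chain_b = [(0, 0, 0), (3, 0, 0), (15, 0, 0), (18, 0, 0)]
    sa = map_to_string(contact_map(chain_a))
    sb = map_to_string(contact_map(chain_b))
    print(sa, sb, lz_distance(sa, sb))
```

A pairwise matrix of such distances over all chains in a data set could then be passed to any standard hierarchical clustering routine. Because the distance depends only on the contact-map strings, it does not change when a structure is rotated or translated.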
Source
Chinese High Technology Letters (《高技术通讯》)
CAS
CSCD
PKU Core Journals
2007, No. 7, pp. 742-748 (7 pages)
Funding
Supported by the National Natural Science Foundation of China (60371046).
Keywords
bioinformatics
protein 3D structure
structure comparison
LZ complexity distance
contact map