Construction of similarity networks appropriate for protein family classification

Abstract: Graph clustering can give good results for protein classification, provided that the complex relationships among proteins are first transformed into a suitable similarity network that serves as the input to the clustering. This paper presents a method for constructing such networks based on BLAST searches. Starting from a target protein sequence, several rounds of BLAST search progressively extract from the database the sequences that are directly or indirectly related to the target, and these sequences form a correlation set. The similarity relations among the sequences in the correlation set constitute the similarity network, which serves as the classification basis for the graph clustering algorithm. The method was tested on a selected set of proteins from the Pfam database that are difficult to classify correctly from direct similarity relations alone. Used together with our graph clustering algorithm, the resulting similarity networks yielded accurate family assignments for most of the proteins in this hard test set, namely 83.2%.
Source: Industrial Microbiology (《工业微生物》), indexed in CAS and CSCD, 2015, No. 3, pp. 53-57 (5 pages)
Keywords: similarity networks; graph clustering; protein family classification
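
The multi-round retrieval described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the number of rounds, any score or E-value cutoffs, or how edges are weighted, so `rounds`, the toy scores, and the helper names `toy_search`, `build_correlation_set`, and `build_similarity_network` are all assumptions made for illustration. A small in-memory table stands in for a real BLAST search against a sequence database.

```python
# Hypothetical toy "database": pairwise similarity scores standing in for
# the hits a real BLAST search would return. All IDs and scores are invented.
TOY_HITS = {
    "target": {"A": 0.9, "B": 0.7},
    "A": {"target": 0.9, "C": 0.8},
    "B": {"target": 0.7},
    "C": {"A": 0.8, "D": 0.6},
    "D": {"C": 0.6},
    "E": {},
}


def toy_search(query):
    """Stand-in for one BLAST search: return {hit_id: score} for a query."""
    return TOY_HITS.get(query, {})


def build_correlation_set(target, search, rounds):
    """Expand outward from the target for a fixed number of search rounds.

    Round 1 finds sequences directly related to the target; later rounds
    search with the newly found sequences and so reach indirect relatives.
    """
    members = {target}
    frontier = {target}
    for _ in range(rounds):
        new_hits = set()
        for query in frontier:
            new_hits.update(search(query))
        frontier = new_hits - members  # keep only sequences not seen before
        if not frontier:
            break
        members |= frontier
    return members


def build_similarity_network(members, search):
    """Collect the similarity edges whose two endpoints are both in the set."""
    edges = {}
    for query in members:
        for hit, score in search(query).items():
            if hit in members and hit != query:
                edges[frozenset((query, hit))] = score
    return edges


members = build_correlation_set("target", toy_search, rounds=2)
network = build_similarity_network(members, toy_search)
print(sorted(members))
print(len(network), "edges")
```

With two rounds, the toy correlation set contains the target, its direct hits, and one indirect relative reached through a direct hit. The resulting edge dictionary is the kind of input a graph clustering step would take. In a real setting, `toy_search` would be replaced by an actual BLAST query with whatever cutoffs the study uses.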