Clustering Protein Sequences through Modularity Optimization
Abstract: Remote homology detection between protein sequences is one of the principal research tasks in structural and functional genomics. Proteins that share similar structure and function but have low sequence similarity constitute a protein superfamily, so detecting remote homologues is equivalent to identifying protein superfamilies. The authors propose ModuleFind, a clustering algorithm based on network modularity. The method maximizes the modularity of the protein network to find a partition with strong community structure. Experiments at the superfamily level of the Structural Classification of Proteins (SCOP) database show that the resulting clusters are closer to the reference classification and achieve relatively high F-measure values, where the F-measure combines precision and recall.
Source: Journal of Food Science and Biotechnology (食品与生物技术学报), 2010, No. 1, pp. 123-127 (5 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: National 863 Program of China (grant 2006AA020204)
Keywords: protein network; sequence similarity; remote homology; modularity; clustering; Structural Classification of Proteins (SCOP) database
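
The abstract names two quantities without defining them: the modularity of a partition of the protein network, and the F-measure used to score clusters against SCOP superfamilies. The sketch below illustrates both under stated assumptions and is not the ModuleFind implementation. It assumes the standard Newman modularity for an undirected weighted graph and the usual harmonic-mean F-measure of one cluster against one reference class. The function names `modularity` and `f_measure`, the toy graph, and the unit edge weights are all hypothetical; the record does not say how the paper weights edges (for example by sequence similarity) or how it aggregates F-measure over clusters.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Newman modularity Q of a partition of an undirected weighted graph.

    edges:      iterable of (u, v, weight) tuples, each edge listed once
    membership: dict mapping every node to its cluster label
    """
    total_weight = 0.0
    internal = defaultdict(float)  # edge weight falling inside each cluster
    degree = defaultdict(float)    # summed weighted degree of each cluster
    for u, v, w in edges:
        total_weight += w
        degree[membership[u]] += w
        degree[membership[v]] += w
        if membership[u] == membership[v]:
            internal[membership[u]] += w
    if total_weight == 0:
        return 0.0
    # Q = sum over clusters of (internal fraction - expected fraction)
    return sum(
        internal[c] / total_weight - (degree[c] / (2.0 * total_weight)) ** 2
        for c in degree
    )


def f_measure(cluster, reference):
    """Harmonic mean of precision and recall of one cluster vs. one class."""
    overlap = len(cluster & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cluster)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


# Toy network (assumption): two triangles joined by a single edge,
# all edges with unit weight.
toy_edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "a" for node in range(6)}

q_two = modularity(toy_edges, two_clusters)
q_one = modularity(toy_edges, one_cluster)
f_example = f_measure({0, 1, 2, 3}, {0, 1, 2})
```

On this toy graph the two-triangle partition scores a higher modularity than placing every node in one cluster, which is the kind of preference a modularity-maximizing method exploits. How ModuleFind searches the space of partitions is not described in this record, so no search procedure is sketched here.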