Influence of Protein Databases in Proteomic Identification
Abstract: In proteomic studies, proteins are usually identified by database searching. Searching against a highly comprehensive but less accurately annotated database may yield more protein identifications, but carries the risk of errors inherited from incorrect annotations. Conversely, searching against an accurately annotated but less complete database may miss proteins that the database does not contain. Achieving both completeness and accuracy in protein identification is therefore an important problem. Taking the human proteome as an example, this study compared the search results of three commonly used protein databases (the IPI database, the UniProt database, and the Swiss-Prot database) on three proteomic datasets obtained from different biological samples and different mass spectrometers. Each database had strengths and weaknesses on the different datasets, but overall the differences among them were small: for each database, the peptides identified only with that database accounted for no more than 5% of the total peptide identifications, and the differences in protein identifications ranged from 1% to 5%. This indicates that all three databases cover the commonly identified human protein sequences and are highly complete. We therefore recommend Swiss-Prot, a manually curated and continuously updated database, for routine human proteomic analysis. When the aim of a study is to identify or quantify sequences that are not included in Swiss-Prot, such as particular protein isoforms or mutants, researchers can add the target sequences to Swiss-Prot before searching, or consider using a more complete database instead.
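The comparison summarized in the abstract (the share of identifications obtained with only one database) can be sketched with plain set operations. This is an illustrative sketch, not the authors' pipeline: the function name `unique_fraction`, the toy peptide strings, and the choice of the union across all databases as the denominator are assumptions, since the abstract does not specify how "total" was counted.

```python
from typing import Dict, Set


def unique_fraction(ids_by_db: Dict[str, Set[str]]) -> Dict[str, float]:
    """Return, per database, the fraction of all identifications that
    were obtained only when searching against that database.

    Assumption: the denominator is the union of identifications across
    all databases being compared.
    """
    all_ids: Set[str] = set().union(*ids_by_db.values())
    fractions: Dict[str, float] = {}
    for name, ids in ids_by_db.items():
        # Identifications found with any of the other databases.
        others: Set[str] = set().union(
            *(v for k, v in ids_by_db.items() if k != name)
        )
        unique = ids - others
        fractions[name] = len(unique) / len(all_ids) if all_ids else 0.0
    return fractions


# Hypothetical toy data: made-up peptide labels, not results from the study.
peptides = {
    "IPI": {"PEPA", "PEPB", "PEPC", "PEPD"},
    "UniProt": {"PEPA", "PEPB", "PEPC", "PEPE"},
    "Swiss-Prot": {"PEPA", "PEPB", "PEPC"},
}

fractions = unique_fraction(peptides)
for db, frac in fractions.items():
    print(f"{db}: {frac:.1%} of all peptide identifications are unique")
```

With real search output, each set would hold the peptide (or protein) identifications that passed the chosen confidence threshold for one database, and the same function applies unchanged.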
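Authors: Shao Chen (邵晨), Sun Wei (孙伟)
Source: Chinese Journal of Biomedical Engineering (《中国生物医学工程学报》), 2013, No. 2, pp. 129-134 (6 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China, Young Scientists Fund (31200614)
Keywords: protein database; proteomics; database searching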