An automata approach to match gapped sequence tags against protein database
Abstract: Tandem mass spectrometry (MS/MS) is an extremely important method for peptide and protein identification. One approach to interpreting MS/MS data is de novo sequencing, which is becoming increasingly accurate and important. However, de novo sequencing can usually determine only part of a sequence with confidence, and the undetermined parts are represented by "mass gaps". We call such a partially determined sequence a gapped sequence tag. When a gapped sequence tag is searched against a database for protein identification, the determined parts must match the database sequence exactly, while each mass gap must match a substring of amino acids whose masses sum to the value of the mass gap. Standard string matching algorithms do not handle this case. In this paper, we present a new, efficient algorithm for finding the matches of gapped sequence tags in a protein database.
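Author: Zhang Tao (张涛)
Source: Computers and Applied Chemistry (《计算机与应用化学》), 2005, No. 10, pp. 845-850 (6 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心).
Funding: National 863 Program Major Special Project (2002AA103061); National Natural Science Foundation of China (10171099).
Keywords: tandem mass spectrometry (MS/MS); Aho-Corasick algorithm; gapped sequence tag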
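
The matching rule stated in the abstract can be illustrated with a small brute-force sketch. This is not the automaton-based algorithm of the paper, which the abstract does not describe; it only makes the problem definition concrete. The tag representation (a list mixing strings for determined parts and integers for mass gaps), the function names, and the use of nominal integer residue masses are all assumptions made for illustration. Real data would need accurate masses and a mass tolerance.

```python
# Nominal (integer) amino acid residue masses in daltons.
# Using integers is a simplifying assumption for this sketch.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def match_at(tag, protein, pos):
    """Return the end index if `tag` matches `protein` starting at `pos`, else None.

    `tag` is a hypothetical representation: a list whose items are either
    a str (a determined part, matched exactly) or an int (a mass gap).
    """
    for part in tag:
        if isinstance(part, str):
            # Determined part: must match the database sequence exactly.
            if not protein.startswith(part, pos):
                return None
            pos += len(part)
        else:
            # Mass gap: consume residues until their masses reach the gap.
            # All masses are positive, so at most one substring starting
            # at `pos` can sum to the gap value.
            total = 0
            while total < part and pos < len(protein):
                mass = RESIDUE_MASS.get(protein[pos])
                if mass is None:  # unknown residue letter, cannot match
                    return None
                total += mass
                pos += 1
            if total != part:
                return None
    return pos


def find_matches(tag, protein):
    """Try every start position; return (start, end) spans of all matches."""
    matches = []
    for start in range(len(protein)):
        end = match_at(tag, protein, start)
        if end is not None:
            matches.append((start, end))
    return matches


# "TA", then a 220 Da gap (Y + G = 163 + 57), then "SA".
print(find_matches(["TA", 220, "SA"], "MKTAYGSAVLK"))  # [(2, 8)]
```

This sketch restarts the comparison at every position of every database sequence, which is the kind of cost an automaton approach such as Aho-Corasick (named in the keywords) is typically used to avoid.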