CUDA-TP: A GPU-Based Parallel Algorithm for Top-Down Intact Protein Identification (cited by: 1)
Abstract: Identifying proteins and their post-translational modifications (PTMs) is fundamental to proteomics research. Recent advances in mass spectrometry (MS) instrumentation have made it possible to acquire high-resolution top-down (TD) mass spectra of intact proteins. Existing algorithms for identifying intact proteins from TD MS data perform well in protein-spectrum matching precision and in localizing PTM sites, but their running times remain far from satisfactory. A graphics processing unit (GPU) can parallelize large-scale repetitive computations and thereby shorten the running time of serial programs. This paper proposes CUDA-TP, an algorithm built on the compute unified device architecture (CUDA) that computes alignment scores between proteins and TD mass spectra. First, for each mass spectrum, CUDA-TP uses an optimized MS-Filter algorithm to discard database proteins that cannot attain a high score, leaving only a small set of candidate proteins. An AVL (Adelson-Velsky and Landis) tree is then used to speed up protein-spectrum matching. GPU multithreading parallelizes the computation of the predecessor nodes (previous diagonal points) of all elements in the spectral grid, which is built from the mass spectrum and the protein, and in the final array. The algorithm also uses a target-decoy strategy to control the false discovery rate (FDR) of protein-spectrum matches. Experimental results show that CUDA-TP significantly accelerates intact protein identification, running about 10 times faster than MS-TopDown and about 2 times faster than MS-Align+. To the authors' knowledge, no other published method uses the CUDA architecture for protein identification from top-down spectra. The source code is available at https://github.com/dqiong/CUDA-TP.
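The target-decoy step mentioned in the abstract can be illustrated with a small sketch. The abstract does not state how CUDA-TP estimates FDR, so the code below assumes the common estimate (decoy hits divided by target hits above a score cutoff). The function name `fdr_threshold` and the `(score, is_decoy)` input layout are hypothetical and are not taken from the CUDA-TP source.

```python
from typing import List, Tuple


def fdr_threshold(psms: List[Tuple[float, bool]], max_fdr: float = 0.01) -> float:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms: (score, is_decoy) pairs; a higher score means a better match.
    Assumed FDR estimate at a cutoff: decoy hits / target hits among all
    matches scoring at or above the cutoff.
    Returns float('inf') when no cutoff satisfies the bound.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best = float("inf")
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Count every match tied at this score before evaluating the cutoff.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1]:
                decoys += 1
            else:
                targets += 1
            i += 1
        if targets > 0 and decoys / targets <= max_fdr:
            best = score
    return best


if __name__ == "__main__":
    # Toy scores only; these are not results from the paper.
    matches = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, False)]
    print(fdr_threshold(matches, max_fdr=0.25))
```

Matches scoring at or above the returned cutoff would be reported as identifications. Other FDR estimators exist, and the one chosen affects the cutoff.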
Authors: Duan Qiong; Tian Bo; Chen Zheng; Wang Jie; He Zengyou (School of Software, Dalian University of Technology, Liaoning 116620; Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province (Dalian University of Technology), Dalian, Liaoning 116620)
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2018, No. 7, pp. 1525-1538 (14 pages)
Funding: National Natural Science Foundation of China (61572094); Fundamental Research Funds for the Central Universities (DUT14QY07)
Keywords: top-down proteomics; protein identification; graphics processing unit (GPU); compute unified device architecture (CUDA); spectral alignment
