
A Customized Coprocessor Acceleration of Genome Re-Sequencing (original title: 基于定制协处理器的基因重测序加速技术研究)
Abstract: Since high-throughput (next-generation) sequencing technology came into use in January 2008, sequencing throughput has improved significantly while cost has continued to fall. However, genomic data is now growing faster than Moore's Law, and the capacity to process this volume of data has become the bottleneck limiting wider adoption of sequencing. As an emerging data-intensive workload, high-throughput re-sequencing tools such as Hash-index-based programs behave differently from traditional computational applications: low arithmetic intensity and irregular memory access patterns are the major sources of inefficiency on commodity multi-core platforms. Targeting a Hash-index-based re-sequencing (short reads mapping) algorithm, this paper analyzes its computation and memory access behavior and proposes an architecture that uses a field programmable gate array (FPGA) as a coprocessor, designed and implemented on a Convey HC-1ex reconfigurable computer. Each processing element (PE) is fully pipelined internally, uses FIFOs to isolate its compute modules from its memory access modules, and executes the complete core flow of the re-sequencing algorithm. Each PE is bound one-to-one to an exclusive memory port, which yields 64 parallel processing flows on 4 Xilinx Virtex-6 LX760 FPGAs operating at 150 MHz, with a total average memory read bandwidth of 22.59 GBps. Compared with an 8-core Intel Xeon processor, the design achieves a 28.5-fold speedup. The proposed design is therefore a potential solution to the large-scale data challenge in high-throughput re-sequencing.
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2014, Issue 9, pp. 1980-1992 (13 pages)
Funding: National Basic Research Program of China (973 Program), grant 2012CB316502
Keywords: high-throughput sequencing; short reads mapping; Hash-index; field programmable gate array (FPGA); heterogeneous architecture
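
For readers unfamiliar with Hash-index-based short reads mapping, the general idea can be sketched in software as follows. This is a minimal, generic seed-and-verify illustration and not the paper's algorithm or its FPGA design, neither of which is detailed in this record. The function names (`build_index`, `map_read`), the seed length `k`, the non-overlapping seeding scheme, and the use of Hamming distance for verification are all assumptions chosen for brevity.

```python
from collections import defaultdict


def build_index(reference, k):
    """Hash index: map each k-mer of the reference to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, index, k, max_mismatches=2):
    """Return sorted (start, mismatches) pairs for one read.

    Seeds are non-overlapping k-mers of the read (an assumption of this
    sketch). Each index hit proposes a candidate start position, which is
    then verified against the reference by Hamming distance.
    """
    seen = set()   # candidate starts already verified
    hits = []
    for offset in range(0, len(read) - k + 1, k):
        seed = read[offset:offset + k]
        # Index lookup: an irregular, data-dependent memory access.
        for pos in index.get(seed, ()):
            start = pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue
            if start in seen:
                continue
            seen.add(start)
            mismatches = hamming(read, reference[start:start + len(read)])
            if mismatches <= max_mismatches:
                hits.append((start, mismatches))
    return sorted(hits)
```

In this sketch the index lookups dominate and the arithmetic per lookup is small, which is consistent with the low arithmetic intensity and irregular memory access that the abstract attributes to this class of workload.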


