期刊文献+

序列比对算法中的BW变换索引技术研究及其改进 被引量:3

Research on BW Transform Index Technology in Sequence Alignment Algorithm and Its Improvement
下载PDF
导出
摘要 面向大规模长序列的序列比对问题是生物信息学中最重要的基础问题之一。针对序列比对算法的主流索引技术BW变换(BWT)进行研究,提出一种新的二阶BWT索引方法。与传统BWT方法的逐位索引查找不同,改进后的BWT方法按双位索引查找。实验结果表明,改进后的方法减少了序列比对算法中的循环遍历和计算次数,降低了序列比对算法中索引方法的复杂度,提高了查找效率,尤其适合长序列和大规模序列的索引和查找。 Sequence alignment of large-scale and long sequences is one of the most important and basic issues in bioinformatics.This paper focuses on Burrows-Wheeler Transform(BWT) which is the major index technology in sequence alignment algorithms and proposes a new second-order BWT index concept as well as its implementation.Different from the traditional BWT algorithm while searching with a single character,the algorithm can find two characters at one time.Experimental results show that the second-order BWT index algorithm can reduce the frequency of loop and calculation in sequence alignment algorithm.It can also reduce the alignment algorithm complexity by half and improve the search efficiency /especially for large-scale and long sequence s index and searching process.
出处 《计算机工程》 CAS CSCD 北大核心 2016年第1期282-286,共5页 Computer Engineering
基金 国家自然科学基金资助重点项目(61033009) 国家"111"计划基金资助项目(B07033)
关键词 序列比对 索引 BW变换索引 第二代测序 第三代测序 大规模长序列比对 sequence alignment index Burrows-Wheeler Transform(BWT) index next-generation sequencing third generation sequencing alignment of large-scale and long sequences
  • 相关文献

参考文献15

  • 1Mount D W.Bioinformatics:Sequence and Genome Analysis[M].Berlin,Germany:Springer,2002.
  • 2国宏哲,王亚东.基因组Mapping系统索引构建原理[J].智能计算机与应用,2012,2(4):47-49. 被引量:2
  • 3Altschul S F,Gish W,Miller W,et al.Basic Local Alignment Search Tool[J].Journal of Molecular Biology,1990,215(3):403-410.
  • 4Li Heng,Homer N.A Survey of Sequence Alignment Algorithms for Next-generation Sequencing[J].Briefings in Bioinformatics,2010,11(5):473-483.
  • 5Hach F,Hormozdiari F,Alkan C,et al.mrsFAST:A Cache-oblivious Algorithm for Short-read Mapping[J].Nature Methods,2010,7(8):576-577.
  • 6Li Ruiqiang,Li Yingrui,Kristiansen K,et al.SOAP:Short Oligonucleotide Alignment Program[J].Bio-informatics,2008,24(5):713-714.
  • 7Langmead B,Trapnell C,Pop M,et al.Ultrafast and Memory-efficient Alignment of Short DNA Sequences to the Human Genome[J].Genome Biology,2009,10(3).
  • 8Langmead B,Salzberg S L.Fast Gapped-read Alignment with Bowtie 2[J].Nature Methods,2012,9(4):357-359.
  • 9Abouelhoda M I,Kurtz S,Ohlebusch E.Replacing Suffix Trees with Enhanced Suffix Arrays[J].Journal of Discrete Algorithms,2004,2(1):53-86.
  • 10Li Ruiqiang,Yu Chang,Li Yingrui,et al.SOAP2:An Improved Ultrafast Tool for Short Read Alignment[J].Bioinformatics,2009,25(15):1966-1967.

二级参考文献13

  • 1Burrows M, Wheeler D J. A block-sorting lossless data compression algorithm[R]. Technical Report SRC 124, Digital Equipment Corporation, Palo Alto, California.
  • 2Nelson M. Data Compression with the Burrows-Wheeler Transform [J]. Dr. Dobb'S Journal of Software Tools, 1996,16(5) :46450.
  • 3Donald Adjeroh, Tim Bell, Amar Mukherjee. THE BURROWS-WHEELER TRANSFORM: Data Compression, Suffix Arrays, and Pattern Matching [M]. springer, com, 2008.
  • 4Ross Arnold, Tim Bell. A corpus for the evaluation of lossless compression algorithms[C]//Snowbird, Utah: Data compression conference, 1997.
  • 5Mohammad Hjouj Btoush, Jawed Siddiqi, Babak Akhgar. Observations on compressing text files of varying length [C]//Washingto, DC: Fifth International Conference on Information Technology: New Generations,2008.
  • 6LI H,RUAN J,DURBIN R. Mapping short DNA sequencing reads and calling variants using mapping quality scores[J].Genome Research,2008.1851-1858.
  • 7BURROWS M,WHEELER D J. A block-sorting lossless data compression algorithm.[Technical report124,][R].PaloAlto,CA,Digital Equipment Corporation,1994.
  • 8LIPPERT R A. Space-efficient whole genome comparisons with Burrows-Wheeler transforms[J].Journal of Computational Biology,2005.407-415.
  • 9K ARKKAINEN J. Fast BWT in small space by blockwise suffix sorting[J].Theoretical Computer Science,2007.249-257.
  • 10BAEZA-YATES R A,PERLEBERG C H. Fast and practical approximate string matching[J].Information Processing Letters,1996.21-27.

共引文献4

同被引文献21

引证文献3

二级引证文献11

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部