Abstract
Sequence alignment of large-scale, long sequences is one of the most important fundamental problems in bioinformatics. This paper studies the Burrows-Wheeler Transform (BWT), the mainstream indexing technique in sequence alignment algorithms, and proposes a new second-order BWT index method together with its implementation. Whereas the traditional BWT method looks up the index one character at a time, the improved method looks up two characters at a time. Experimental results show that the second-order BWT index reduces the number of loop iterations and calculations in the sequence alignment algorithm, reduces the complexity of the index method within the alignment algorithm by half, and improves search efficiency. It is especially suited to indexing and searching long and large-scale sequences.
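The difference between one-character and two-character lookup can be illustrated with a toy backward search. The sketch below is not the paper's implementation, and every name in it (`build_index`, `_extend`, `count_matches`) is hypothetical. It assumes a naively sorted suffix array and plain sorted row lists queried with `bisect`, in place of whatever occurrence tables the authors actually use. It only shows that stepping two characters at a time gives the same match count in about half as many loop iterations.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(text):
    """Toy index over text + '$' (assumes text itself contains no '$')."""
    t = text + "$"
    sa = sorted(range(len(t)), key=lambda i: t[i:])  # naive suffix array
    before1 = defaultdict(list)  # char -> rows whose suffix is preceded by it
    before2 = defaultdict(list)  # pair -> rows whose suffix is preceded by it
    for row, pos in enumerate(sa):
        if pos >= 1:
            before1[t[pos - 1]].append(row)
        if pos >= 2:
            before2[t[pos - 2:pos]].append(row)
    return {
        "n": len(t),
        "before": {1: before1, 2: before2},
        # k-character prefixes of the sorted suffixes are themselves sorted
        "prefix": {1: [t[p:p + 1] for p in sa], 2: [t[p:p + 2] for p in sa]},
    }


def _extend(index, lo, hi, key):
    """Prepend key (1 or 2 characters) to the pattern matched by rows [lo, hi)."""
    k = len(key)
    rows = index["before"][k].get(key, [])
    base = bisect_left(index["prefix"][k], key)  # suffixes sorting before key
    return base + bisect_left(rows, lo), base + bisect_left(rows, hi)


def count_matches(index, pattern, step=1):
    """Return (occurrences, loop iterations) for backward search; step is 1 or 2."""
    lo, hi, end, iterations = 0, index["n"], len(pattern), 0
    while end > 0 and lo < hi:
        width = min(step, end)  # an odd-length pattern leaves one single character
        lo, hi = _extend(index, lo, hi, pattern[end - width:end])
        end -= width
        iterations += 1
    return hi - lo, iterations


idx = build_index("ACGTACGTTACG")
print(count_matches(idx, "ACGT", step=1))  # (2, 4): two matches, four iterations
print(count_matches(idx, "ACGT", step=2))  # (2, 2): same matches, two iterations
```

The two-character step relies on the same ordering argument as the ordinary LF-mapping: suffixes that begin with a given pair are sorted by what follows the pair, so counting the rows preceded by that pair is enough to locate the new range. The price is a larger table, since a four-letter DNA alphabet has 16 possible pairs instead of 4 single characters.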
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journal
2016, Issue 1, pp. 282-286 (5 pages)
Funding
Key Program of the National Natural Science Foundation of China (61033009)
National "111" Project of China (B07033)
Keywords
sequence alignment
index
Burrows-Wheeler Transform (BWT) index
next-generation sequencing
third-generation sequencing
alignment of large-scale and long sequences