Abstract
The P-BWT (partitioned Burrows-Wheeler transform) exact matching algorithm supports only short queries and runs only on a single processor. To address both limitations, a multi-core parallel exact matching algorithm that supports queries of arbitrary length is proposed. First, the search process on the P-BWT index is modified: when a query spans multiple data fragments, it is first searched in the last fragment it matches and then verified, fragment by fragment, against the preceding fragments. Second, a multi-core parallel query algorithm is proposed to reduce the number of iterations in the search-and-verify process. Experimental results show that the proposed algorithm completes the substring matching task efficiently and in parallel.
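The search-then-verify idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's implementation: plain Python string operations stand in for BWT backward search on each fragment, the fragments have a fixed size, and the names `split_fragments`, `matches_ending_in`, and `find_all` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_fragments(text, size):
    """Cut the text into consecutive fragments of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def matches_ending_in(fragments, offsets, k, query):
    """Global start positions of occurrences whose last character lies in fragment k."""
    frag, m = fragments[k], len(query)
    hits = []

    # Occurrences that lie entirely inside fragment k.
    # (Assumption: str.find stands in for a per-fragment BWT index lookup.)
    pos = frag.find(query)
    while pos != -1:
        hits.append(offsets[k] + pos)
        pos = frag.find(query, pos + 1)

    # Occurrences that start in an earlier fragment and end in fragment k:
    # the last e characters of the query must be a prefix of fragment k.
    for e in range(1, min(m - 1, len(frag)) + 1):
        if not frag.startswith(query[m - e:]):
            continue  # search step on the last fragment failed
        rest, j = query[:m - e], k - 1
        # Verification step: walk backwards over the preceding fragments.
        while rest and j >= 0:
            prev = fragments[j]
            if len(rest) <= len(prev):
                if prev.endswith(rest):
                    rest = ""  # the remaining prefix is fully verified
                break
            if not rest.endswith(prev):
                break  # mismatch inside a fully covered fragment
            rest, j = rest[:len(rest) - len(prev)], j - 1
        if not rest:
            hits.append(offsets[k] - (m - e))
    return hits


def find_all(text, query, size=4, workers=4):
    """Search every candidate last fragment concurrently and merge the results."""
    if not query:
        raise ValueError("query must be non-empty")
    fragments = split_fragments(text, size)
    offsets = [i * size for i in range(len(fragments))]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(
            lambda k: matches_ending_in(fragments, offsets, k, query),
            range(len(fragments)),
        )
        return sorted(p for part in parts for p in part)
```

For example, `find_all("abracadabra_abracadabra", "cadabra_ab", size=4)` returns `[4]`, even though the query is longer than any single fragment. Each occurrence is reported exactly once because it is attributed to the unique fragment that holds its last character. The thread pool only shows how the work divides per fragment; in standard CPython, a process pool or a native implementation would be needed for real multi-core speedup.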
Source
Journal of Northeastern University (Natural Science)
EI
CAS
CSCD
PKU Core Journal
2016, Issue 5, pp. 624-628 (5 pages)
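Funding
National Natural Science Foundation of China (61322208, 61272178, 61129002, 61572122, 61532021)
Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education of China (20110042110028)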
Keywords
BWT (Burrows-Wheeler transform)
full-text index
exact matching
parallel
multi-core