Abstract
Based on an analysis of the three basic strategies for de novo sequence assembly, namely the greedy strategy, the OLC (Overlap-Layout-Consensus) strategy and the DBG (De Bruijn Graph) strategy, this paper studies how to construct a hybrid-strategy sequence assembly algorithm that integrates the advantages of several single strategies. Drawing on the team's strengths in formal methods and formal platforms, and combining domain analysis modeling with generative programming, two algorithms based on the OLC strategy (OLC_assembly_1, OLC_assembly_2) and one algorithm based on the DBG strategy (DBG_assembly) are constructed, and these are further assembled into an algorithm in the (OLC+DBG)→OLC hybrid mode (referred to as the ODO algorithm). Finally, three experimental samples are selected from GenBank, and the assembly results of the three single-strategy algorithms and the constructed ODO algorithm are compared in terms of N50, Contigs number, Coverage and other measures; the effect of changes in coverage depth and in the value of k on the assembly results is also analyzed. The experimental results show that the ODO algorithm implemented in this paper has certain advantages over the single-strategy algorithms in parameters such as N50 and Coverage.
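As a rough illustration of two notions the abstract relies on, the sketch below computes N50 from a list of contig lengths and builds the edge list of a de Bruijn graph from reads for a given k. It is not the paper's OLC_assembly, DBG_assembly or ODO code; the function names `n50` and `de_bruijn_edges` and the toy inputs are assumptions made for this example, and N50 follows the common definition (the largest length L such that contigs of length at least L cover at least half of the total assembled length).

```python
from collections import defaultdict


def n50(contig_lengths):
    """Return N50 under the common definition (hypothetical helper).

    Contigs are visited from longest to shortest; the length at which
    the running sum first reaches half of the total is returned.
    """
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer prefix to the (k-1)-mer suffixes that follow it.

    Every k-mer in every read contributes one edge prefix -> suffix.
    Repeated k-mers produce repeated edges (multiplicity is kept).
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6]))
    print(de_bruijn_edges(["ACGT"], 3))
```

A smaller k yields more shared (k-1)-mers and therefore a more connected but more ambiguous graph, which is one reason the value of k can affect assembly results.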
Authors
XIAO Cunwei (肖存威)
SHI Haihe (石海鹤)
WANG Lan (王岚)
CHENG Bailiang (程柏良)
School of Computer and Information Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022, China
Source
Journal of Jiangxi Normal University (Natural Science Edition) (《江西师范大学学报(自然科学版)》)
CAS
Peking University Core Journals (北大核心)
2022, No. 3, pp. 300-307 (8 pages in total)
Funding
National Natural Science Foundation of China (62062039, 61662035)
Natural Science Foundation of Jiangxi Province (20202BAB202024, 20212BAB202017)