Abstract
The rapid growth of bioinformatics data calls for new techniques in the field. Current genome assembly algorithms still suffer from limited accuracy and insufficient parallelization. After analyzing existing assembly algorithms, the paper proposes an assembly algorithm based on MapReduce. Erroneous data are removed during assembly by statistical filtering, and repeated data are eliminated by increasing the k-mer length. Finally, the parallel assembly algorithm is implemented on the MapReduce platform. Experimental results show that the algorithm improves both the accuracy and the computation speed of assembly.
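The abstract does not give implementation details, so the following is only a minimal sketch of how k-mer counting with statistical error filtering might be phrased in map/reduce style. It runs in a single Python process rather than on a MapReduce cluster, and the function names (`map_read`, `reduce_counts`, `filter_errors`, `count_kmers`) and the `min_count` threshold are illustrative assumptions, not the authors' method.

```python
from collections import defaultdict
from itertools import chain


def map_read(read, k):
    """Map step: emit (k-mer, 1) for every length-k window of one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def reduce_counts(pairs):
    """Shuffle and reduce step: sum the counts emitted for each k-mer."""
    counts = defaultdict(int)
    for kmer, n in pairs:
        counts[kmer] += n
    return dict(counts)


def filter_errors(counts, min_count):
    """Drop k-mers seen fewer than min_count times.

    Assumption: rare k-mers are treated as sequencing errors; the
    threshold is hypothetical and would need tuning on real data.
    """
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


def count_kmers(reads, k, min_count=2):
    """Run the map, reduce, and filter steps over a collection of reads."""
    pairs = chain.from_iterable(map_read(read, k) for read in reads)
    return filter_errors(reduce_counts(pairs), min_count)
```

Calling `count_kmers(reads, k)` with a larger `k` yields longer k-mers, which is the lever the abstract describes for separating repeated regions; each read is mapped independently, which is what makes the map step straightforward to parallelize.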
Source
Journal of Dali University (《大理大学学报》)
CAS
2016, No. 6, pp. 4-7 (4 pages in total)
Funding
Supported by the Dali University Young Teachers' Research Fund (KYQN201218)