Gclust: A Parallel Clustering Tool for Microbial Genomic Data
Authors: Ruilin Li, Xiaoyu He, Chuangchuang Dai, Haidong Zhu, Xianyu Lang, Wei Chen, Xiaodong Li, Dan Zhao, Yu Zhang, Xinyin Han, Tie Niu, Yi Zhao, Rongqiang Cao, Rong He, Zhonghua Lu, Xuebin Chi, Weizhong Li, Beifang Niu. Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2019, Issue 5, pp. 496-502 (7 pages)
The accelerating growth of the public microbial genomic data imposes substantial burden on the research community that uses such resources. Building databases for non-redundant reference sequences from massive microbial genomic data based on clustering analysis is essential. However, existing clustering algorithms perform poorly on long genomic sequences. In this article, we present Gclust, a parallel program for clustering complete or draft genomic sequences, where clustering is accelerated with a novel parallelization strategy and a fast sequence comparison algorithm using sparse suffix arrays (SSAs). Moreover, genome identity measures between two sequences are calculated based on their maximal exact matches (MEMs). In this paper, we demonstrate the high speed and clustering quality of Gclust by examining four genome sequence datasets. Gclust is freely available for non-commercial use at https://github.com/niu-lab/gclust. We also introduce a web server for clustering user-uploaded genomes at http://niulab.scgrid.cn/gclust.
Keywords: Microbial genome clustering; Parallelization; Sparse suffix array; Maximal exact match; Segment extension
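The two building blocks named in the abstract, sparse suffix arrays and maximal exact matches, can be illustrated with a minimal Python sketch. It is not Gclust's implementation: the function names, the sampling step `k`, the `min_len` threshold, and the coverage-based score are all assumptions made for illustration, and the abstract does not give Gclust's actual identity formula. The MEM search here is brute force and does not use the suffix array; a real tool would use the index to avoid the quadratic scan.

```python
def sparse_suffix_array(text, k):
    """Sort only every k-th suffix of `text` (k=1 gives the full suffix array).

    Naive construction for illustration: slicing makes it slow on long input.
    """
    return sorted(range(0, len(text), k), key=lambda i: text[i:])


def maximal_exact_matches(a, b, min_len):
    """Brute-force MEMs between `a` and `b` as (pos_a, pos_b, length) tuples.

    A match is kept only if it cannot be extended to the left or right.
    """
    mems = []
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            # Skip starts that could be extended to the left (not left-maximal).
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend to the right until a mismatch or the end of a sequence.
            length = 0
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            if length >= min_len:
                mems.append((i, j, length))
    return mems


def mem_coverage(a, b, min_len):
    """Hypothetical similarity score: fraction of `a` covered by MEMs with `b`."""
    if not a:
        return 0.0
    covered = [False] * len(a)
    for i, _, length in maximal_exact_matches(a, b, min_len):
        for p in range(i, i + length):
            covered[p] = True
    return sum(covered) / len(a)


if __name__ == "__main__":
    print(sparse_suffix_array("banana", 2))
    print(maximal_exact_matches("ACGTACGT", "TTACGTAA", 3))
    print(mem_coverage("ACGTACGT", "TTACGTAA", 3))
```

Sampling every `k`-th suffix shrinks the index by roughly a factor of `k`, which is the usual reason to prefer a sparse suffix array on long genomic sequences; the cost is extra work at query time.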