A number of basic and applied questions in ecology and environmental management require the characterization of soil and leaf litter faunal diversity. Recent advances in high-throughput sequencing of barcode-gene amplicons ('metabarcoding') have made it possible to survey biodiversity in a robust and efficient way. However, one obstacle to the widespread adoption of this technique is the need to choose amongst many candidates for bioinformatic processing of the raw sequencing data. We compare three candidate pipelines for the processing of 18S small subunit rDNA metabarcode data from solid substrates: (i) USEARCH/CROP, (ii) Denoiser/UCLUST, and (iii) OCTUPUS. The three pipelines produced reassuringly similar and highly correlated assessments of community composition that are dominated by taxa known to characterize the sampled environments. However, OCTUPUS appears to inflate phylogenetic diversity, because of higher sequence noise. We therefore recommend either the USEARCH/CROP or Denoiser/UCLUST pipelines, both of which can be run within the QIIME (Quantitative Insights Into Microbial Ecology) environment.
Funding: Supported by Yunnan Province (20080A001), the Chinese Academy of Sciences (0902281081, KSCX2-YW-Z-1027), the National Natural Science Foundation of China (31170498), the Ministry of Science and Technology of China (2012FY110800), Kunming Institute of Zoology, and the University of East Anglia.