BIGpre: A Quality Assessment Package for Next-Generation Sequencing Data (Cited by: 4)


Abstract: The emergence of next-generation sequencing (NGS) technologies has significantly improved sequencing throughput and reduced costs. However, the short read length, duplicate reads and massive volume of data make data processing much more difficult and complicated than with first-generation sequencing technology. Although some software packages have been developed to assess data quality, those packages either are not easily available to users or require bioinformatics skills and computer resources. Moreover, almost none of the quality assessment software currently available takes sequencing errors into account when assessing duplicates in NGS data. Here, we present a new user-friendly quality assessment software package called BIGpre, which works for both Illumina and 454 platforms. BIGpre contains all the functions of other quality assessment software, such as the correlation between forward and reverse reads, read GC-content distribution, and base Ns quality. More importantly, BIGpre incorporates associated programs to detect and remove duplicate reads after taking sequencing errors into account, and to trim low-quality reads from raw data. BIGpre is primarily written in Perl and integrates graphical capability from the statistics package R. The package produces both tabular and graphical summaries of data quality for sequencing datasets from Illumina and 454 platforms. Processing hundreds of millions of reads within minutes, it provides immediate diagnostic information that users can use to prepare sequencing data for downstream analyses. BIGpre is freely available at http://bigpre.sourceforge.net/.
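The abstract names two checks, read GC content and duplicate detection that tolerates sequencing errors, without describing how BIGpre computes them. The following Python sketch is only an illustration of the general idea, not BIGpre's Perl implementation: the function names, the `max_mismatches` parameter, and the choice of Hamming distance as the error model are all assumptions made for this example.

```python
from typing import List


def gc_content(read: str) -> float:
    """Fraction of G/C among called (A/C/G/T) bases; 0.0 if none are called."""
    called = [b for b in read.upper() if b in "ACGT"]
    if not called:
        return 0.0
    return sum(b in "GC" for b in called) / len(called)


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def remove_near_duplicates(reads: List[str], max_mismatches: int = 1) -> List[str]:
    """Keep the first read of each group of equal-length reads that differ
    by at most max_mismatches positions (hypothetical error tolerance).

    This is a naive O(n^2) comparison, suitable only for small examples.
    """
    kept: List[str] = []
    for read in reads:
        is_duplicate = any(
            len(read) == len(k) and hamming(read, k) <= max_mismatches
            for k in kept
        )
        if not is_duplicate:
            kept.append(read)
    return kept


if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTACGA", "TTTTCCCC", "ACGTACGT"]
    print(remove_near_duplicates(reads))
    print([round(gc_content(r), 2) for r in reads])
```

With exact matching, a read carrying a single base-call error would be counted as a distinct sequence; allowing a small number of mismatches groups it with its likely source read. A tool that handles hundreds of millions of reads would need indexing or hashing rather than the pairwise comparison shown here.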

Authors: Tongwu Zhang, Yingfeng Luo, Kan Liu, Linlin Pan, Bing Zhang, Jun Yu, Songnian Hu

Affiliations: CAS Key Laboratory of Genome Sciences and Information; James D. Watson Institute of Genome Sciences

Source: Genomics, Proteomics & Bioinformatics (基因组蛋白质组与生物信息学报, English edition), 2011, Issue 6, pp. 238-244 (7 pages). Indexed in: SCIE, CAS, CSCD.

Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 31000561 and 30900825) and the Knowledge Innovation Program of the Chinese Academy of Sciences (Grant No. KSCX2-EW-R-01-04).

Keywords: next-generation sequencing, quality assessment, duplicate reads, sequencing error

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]; O144 [Science: Fundamental Mathematics]



