
Bioinformatics Procedure of Large-scale GO Annotation (大规模GO注释的生物信息学流程) Cited by: 8
Abstract: With the rapid development of next-generation sequencing technologies, massive amounts of sequence data provide biologists with a tremendous resource for mining gene information. An important task in this mining is the functional annotation of sequences, and the most important form of functional annotation is Gene Ontology (GO) annotation. Using bioinformatics methods and software tools, this study assembled a large-scale GO annotation pipeline (LSGAP) for EST sequences. The pipeline combines software such as BLAST, B2g4pipe and Wego with commonly used protein databases such as Swissprot, Nr and Interpro. Users submit EST sequences in FASTA format and obtain visualized GO classification statistics charts, which show how the genes participate in different processes. To verify the accuracy of LSGAP, the EST sequences of the eastern oyster (Crassostrea virginica) published in 2007 were run through the pipeline, and the results showed that the GO analysis was accurate and effective. Compared with other GO annotation software such as Blast2go (graphical user interface) and GoBlast, LSGAP has several advantages: it runs BLAST locally, has low hardware requirements, does not require downloading many GO-related databases, and takes less running time. LSGAP is therefore an effective tool for researchers mining gene function.
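The last step described in the abstract, turning per-sequence GO annotations into classification statistics, can be illustrated with a minimal sketch. It is not the LSGAP implementation. The input layout (one `sequence_id<TAB>GO_id` pair per line), the function names, and the small GO-to-namespace lookup table are assumptions made for illustration only.

```python
from collections import defaultdict


def parse_annotations(lines):
    """Map each sequence ID to the set of GO IDs assigned to it.

    Assumed (hypothetical) input layout: "sequence_id<TAB>GO_id" per line.
    Blank lines, lines starting with "#", and rows with fewer than two
    tab-separated fields are skipped.
    """
    seq_to_go = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # malformed row
        seq_to_go[fields[0]].add(fields[1])
    return dict(seq_to_go)


def count_by_namespace(seq_to_go, go_namespace):
    """Count distinct sequences annotated in each GO namespace.

    go_namespace is a caller-supplied dict from GO ID to namespace name;
    GO IDs missing from it are ignored.
    """
    seqs_per_namespace = defaultdict(set)
    for seq_id, go_ids in seq_to_go.items():
        for go_id in go_ids:
            namespace = go_namespace.get(go_id)
            if namespace is not None:
                seqs_per_namespace[namespace].add(seq_id)
    return {ns: len(ids) for ns, ids in seqs_per_namespace.items()}


if __name__ == "__main__":
    # Illustrative records only; the sequence IDs are made up.
    example_lines = [
        "est_001\tGO:0005634",  # nucleus
        "est_001\tGO:0003677",  # DNA binding
        "est_002\tGO:0006412",  # translation
    ]
    example_namespaces = {
        "GO:0005634": "cellular_component",
        "GO:0003677": "molecular_function",
        "GO:0006412": "biological_process",
    }
    annotations = parse_annotations(example_lines)
    print(count_by_namespace(annotations, example_namespaces))
```

Counting distinct sequences rather than annotation rows keeps a sequence with several terms in the same namespace from being counted more than once, which is the quantity a per-category bar chart would normally plot.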
Source: Journal of Xiamen University: Natural Science (《厦门大学学报(自然科学版)》), CAS, CSCD, Peking University Core Journal, 2012, Issue 1, pp. 139-143 (5 pages)
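Funding: National Key Basic Research Program of China (973 Program) (2010CB126403); National Natural Science Foundation of China (40976093); Fujian Province Young Scientific and Technological Talent Innovation Project (2008F3098)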
Keywords: LSGAP; GO annotation; gene function; bioinformatics

References (11)

  • 1. Harris M A, Clark J, Ireland A, et al. The Gene Ontology (GO) database and informatics resource [J]. Nucleic Acids Res, 2004, 32(Database issue): 258-261.
  • 2. Beissbarth T, Speed T P. GOstat: find statistically over-represented Gene Ontologies within a group of genes [J]. Bioinformatics, 2004, 20(9): 1464-1465.
  • 3. Wang Chenggang, Mo Zhihong. GoBlast: software integrating BLAST search and GO annotation [J]. Chinese Journal of Biochemistry and Molecular Biology, 2006, 22(12): 1003-1006. Cited by: 3
  • 4. Ana C, Stefan G. Blast2go: a universal tool for annotation, visualization and analysis in functional genomics research [J]. Bioinformatics, 2005, 21: 3674-3676.
  • 5. Ralf S, Mark L. Annoter: GO, EC and KEGG annotation of EST databases [J]. BMC Bioinformatics, 2008, 9: 180.
  • 6. Altschul S F, Gish W, Miller W, et al. Basic local alignment search tool [J]. J Mol Biol, 1990, 215: 403-410.
  • 7. Ye J, Fang L, Zheng H, et al. WEGO: a web tool for plotting GO annotations [J]. Nucleic Acids Res, 2006, 34: 293-297.
  • 8. Bethesda M D. FASTA format description [DB/OL]. [2005-03-22]. http://blast.wustl.edu/.
  • 9. Jason E S, David B, Kris B, et al. The Bioperl toolkit: Perl modules for the life sciences [J]. Genome Res, 2002, 12(10): 1611-1618.
  • 10. Jonas Q, Shaolin W, Ping L, et al. Generation and analysis of ESTs from the eastern oyster, Crassostrea virginica Gmelin and identification of microsatellite and SNP markers [J]. BMC Genomics, 2007, 8: 157.

Secondary references (10)

  • 1. Harris M A, Clark J, Ireland A, et al. The Gene Ontology (GO) database and informatics resource [J]. Nucleic Acids Res, 2004, 32(Database issue): 258-261.
  • 2. Altschul S F, Madden T L, Schaffer A A, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J]. Nucleic Acids Res, 1997, 25(17): 3389-3402.
  • 3. Gene ontology: tool for the unification of biology. Stanford, CA: Stanford University, 2002 [2005-12-4]. http://www.geneontology.org
  • 4. Debian: the Debian GNU/Linux Operating System. Indianapolis, IN: The Debian Project, 2006 [2006-4-9]. http://www.debian.org
  • 5. Apache: The Apache HTTP Server. Los Angeles, CA: The Apache Software Foundation, 2005 [2005-11-9]. http://www.apache.org
  • 6. MYSQL: The open source SQL Database Server. Seattle, WA: MySQL Inc., 2001 [2005-4-13]. http://www.mysql.com
  • 7. FASTA format description. Bethesda, MD: National Center for Biotechnology Information, 2001 [2005-12-4]. http://www.ncbi.nlm.nih.gov/blast/fasta.shtml
  • 8. WU-BLAST: Washington University BLAST Server. Saint Louis, Missouri: Washington University, 2005 [2005-3-22]. http://blast.wustl.edu/.
  • 9. Jason E S, David B, Kris B, et al. The Bioperl Toolkit: Perl modules for the life sciences [J]. Genome Res, 2002, 12(10): 1611-1618.
  • 10. ACHA BRARE: Acetylcholine receptor protein subunit alpha precursor. Washington, DC: National Biomedical Research Foundation at the Georgetown University Medical Center, 2006 [2006-7-4]. http://www.ebi.uniprot.org/entry/ACHA_BRARE

Co-citing literature: 2

Co-cited literature: 104

Citing literature: 8

Secondary citing literature: 33
