
Vertebrate Gene Predictions and the Problem of Large Genes
Cited by: 1
Abstract: To find unknown protein-coding genes, annotation pipelines combine ab initio gene prediction with similarity to experimentally confirmed genes or proteins. Here, we show that although the ab initio predictions have an intrinsically high false-positive rate, they also have a consistently low false-negative rate. The incorporation of similarity information is meant to reduce the false-positive rate, but in doing so it increases the false-negative rate. The crucial variable is gene size (including introns): genes of the most extreme sizes, especially very large genes, are most likely to be incorrectly predicted.
Source: World Sci-Tech R&D (《世界科技研究与发展》), CSCD, 2003, No. 6, pp. 42-50 (9 pages)
Keywords: vertebrates, gene annotation, protein coding, gene prediction, false positive, false negative, large gene
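The trade-off described in the abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that the false-positive rate is the fraction of predicted genes with no confirmed counterpart and the false-negative rate is the fraction of confirmed genes that were not predicted; the paper may define these rates differently. The function name `error_rates` and all gene identifiers are hypothetical toy data, not results from the article.

```python
def error_rates(predicted, confirmed):
    """Return (false_positive_rate, false_negative_rate) for two collections of gene IDs.

    Assumed definitions (not taken from the paper):
      false-positive rate = predictions without a confirmed gene / all predictions
      false-negative rate = confirmed genes that were not predicted / all confirmed genes
    """
    predicted, confirmed = set(predicted), set(confirmed)
    false_pos = predicted - confirmed   # predicted, but not confirmed
    false_neg = confirmed - predicted   # confirmed, but missed by the predictor
    fp_rate = len(false_pos) / len(predicted) if predicted else 0.0
    fn_rate = len(false_neg) / len(confirmed) if confirmed else 0.0
    return fp_rate, fn_rate


# Hypothetical toy data: four confirmed genes.
confirmed = {"g1", "g2", "g3", "g4"}

# Ab initio alone: finds every confirmed gene but also many spurious ones.
ab_initio = {"g1", "g2", "g3", "g4", "x1", "x2", "x3", "x4"}

# After filtering by similarity: fewer spurious calls, but real genes are lost.
with_similarity = {"g1", "g2", "x1"}

print(error_rates(ab_initio, confirmed))
print(error_rates(with_similarity, confirmed))
```

In this toy case the similarity filter lowers the false-positive rate while raising the false-negative rate, which is the pattern the abstract reports for real annotation pipelines.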