
Analysis and prediction of exon, intron, intergenic region and splice sites for A. thaliana and C. elegans genomes

Abstract: Although a great deal of research has been undertaken on the annotation of gene structure, predictive techniques are still not fully developed. In this paper, based on the base composition of sequences and the conservation of nucleotides at exon/intron splice sites, a least increment of diversity algorithm (LIDA) is developed for studying and predicting three kinds of coding exons, introns and intergenic regions. First, using the 64 trinucleotide compositions and 120 position parameters of the four bases as informational parameters, coding exon, intron and intergenic sequences are predicted. The overall prediction accuracies are 91.1% and 88.4% for the A. thaliana and C. elegans genomes, respectively. Subsequently, based on the position frequencies of the four bases in regions near the intron/coding exon boundaries and the translation initiation and termination sites, 12 position parameters are selected as the diversity source, and the three kinds of coding exons are predicted using the LIDA. The prediction success rates are higher than 80%. These results can be used in sequence annotation.
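The abstract does not give the formulas behind the LIDA, so the following is only a minimal sketch under stated assumptions. It uses the standard diversity measure D(X) = N ln N - sum(n_i ln n_i) and the increment of diversity ID(X, Y) = D(X + Y) - D(X) - D(Y), and assigns a query sequence to the class whose profile gives the smallest increment. Only the 64 trinucleotide counts are used as features; the 120 position parameters are omitted. The overlapping trinucleotide window, the function names and the toy class profiles are all illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter
from itertools import product

# All 64 trinucleotides in a fixed order, so count vectors are comparable.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def trinucleotide_counts(seq):
    """Count overlapping trinucleotides (assumption: overlapping window)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    return [counts.get(t, 0) for t in TRINUCLEOTIDES]


def diversity(counts):
    """Diversity measure D = N ln N - sum(n_i ln n_i); zero counts contribute 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); small values mean similar sources."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def classify(seq, class_profiles):
    """Return the label whose profile has the least increment of diversity."""
    query = trinucleotide_counts(seq)
    return min(
        class_profiles,
        key=lambda label: increment_of_diversity(query, class_profiles[label]),
    )


# Toy profiles built from made-up sequences (hypothetical training data).
profiles = {
    "exon": trinucleotide_counts("ATGATGATGATG"),
    "intron": trinucleotide_counts("GTTTTTTTTTAG"),
}
predicted = classify("ATGATG", profiles)
```

In practice each class profile would be accumulated from many annotated training sequences rather than from a single string, and the position parameters mentioned in the abstract would be appended to the count vectors.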
Institution: not specified
Source: Journal of Biomedical Science and Engineering, 2009, Issue 6, pp. 367-373 (7 pages)
Keywords: Exon; Intron; Intergenic Region; Splice Site; Increment of Diversity
