THE DISTRIBUTION MODEL OF OPEN READING FRAME LENGTH IN DIFFERENT GENOMES AND THE GENOME EVOLUTION
Cited by: 5
Abstract: The distribution of the number of open reading frames (ORFs) as a function of ORF length was analyzed in the genomes of 5 eukaryotes, 15 bacteria, and 10 archaea. The distributions had a similar form across organisms and showed clear regularity. Fits with several candidate distribution models were compared, and for all 30 genomes the distribution was consistent with a Γ(α,β) distribution. On this basis we propose the hypothesis that the number of ORFs as a function of length in a genome follows a Γ(α,β) distribution. Analysis of the fitted parameters for each genome showed a distinct correlation between the values of α and β and genome evolution. The evolutionary meaning of α and β is discussed, and it is concluded that eukaryotes are biased toward longer ORFs. Based on the Γ(α,β) distribution, the upper limit on the number of ORFs (protein-coding sequences) in the Saccharomyces cerevisiae genome is estimated at approximately 5870. The method is useful for studying genome evolution and for evaluating the reliability of theoretically predicted genes.
Source: Acta Biophysica Sinica (《生物物理学报》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2004, No. 5, pp. 375-381 (7 pages)
Funding: National Natural Science Foundation of China (10147204); Natural Science Foundation of Inner Mongolia
Keywords: Genomes; Open reading frame (ORF); Γ(α,β) distribution; Genome evolution
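
The abstract describes fitting a Γ(α,β) distribution to ORF lengths but does not give the fitting procedure or the parameterization. As a rough illustration only, the sketch below estimates α and β by the method of moments, which is a stand-in technique and not necessarily what the authors used. It assumes the shape/scale form (mean = αβ, variance = αβ²). The function name `fit_gamma_moments` and the synthetic lengths are hypothetical and are not data from the paper.

```python
import random
import statistics


def fit_gamma_moments(lengths):
    """Method-of-moments estimate of (alpha, beta) for a Gamma distribution.

    Assumption: shape/scale parameterization, i.e. mean = alpha * beta and
    variance = alpha * beta**2. The abstract does not say which form the
    authors used, so treat this choice as illustrative.
    """
    mean = statistics.fmean(lengths)
    var = statistics.variance(lengths)  # unbiased sample variance
    beta = var / mean                   # scale estimate
    alpha = mean / beta                 # shape estimate (= mean**2 / var)
    return alpha, beta


# Synthetic "ORF lengths" drawn from a known Gamma distribution
# (hypothetical values chosen for the demo, not taken from any genome).
rng = random.Random(0)
true_alpha, true_beta = 2.0, 150.0
orf_lengths = [rng.gammavariate(true_alpha, true_beta) for _ in range(20000)]

alpha_hat, beta_hat = fit_gamma_moments(orf_lengths)
print(f"estimated alpha = {alpha_hat:.2f}, estimated beta = {beta_hat:.1f}")
```

With real data, `orf_lengths` would be replaced by the ORF lengths extracted from an annotated genome. A maximum-likelihood fit and a goodness-of-fit comparison against alternative models would be closer to the model comparison the abstract mentions.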