Study on the Distribution of Protein Coding Sequences' Number with Its Length in the Human Genome
Abstract The distribution of the number of protein coding sequences as a function of their length was analyzed for the 24 chromosomes of the human genome, and the distributions were found to be clearly similar across chromosomes. When the observed distributions were fitted with a Γ(α, β) distribution, the characteristic parameter α was smaller than 1 in every case; that is, the number of protein coding sequences keeps increasing as their length decreases. In the other organisms studied (15 bacteria, 10 archaea, and 5 eukaryotes), by contrast, the fitted Γ(α, β) distributions all had α > 1. From this comparison, we infer that the distribution of human protein coding sequences should also be a Γ(α, β) distribution with α > 1. After supplementing the short-sequence range with inferred data and refitting, a good fit was obtained, with α between 1.19 and 1.85. The Γ(α, β) distribution followed by genes is of positive significance for studying genome evolution and for evaluating the accuracy of theoretically predicted genes.
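The contrast between α < 1 and α > 1 can be illustrated with a minimal sketch. It assumes the density f(x) = x^(α−1) e^(−x/β) / (Γ(α) β^α) with β as a scale parameter (the abstract does not state the parametrization), uses a simple method-of-moments estimate rather than whatever fitting procedure the authors applied, and runs on synthetic lengths only, not on the paper's data. All function and variable names are hypothetical.

```python
import math
import random
import statistics


def fit_gamma_moments(lengths):
    """Method-of-moments estimate of (alpha, beta), with beta as a scale.

    For a gamma distribution, mean = alpha * beta and variance = alpha * beta**2,
    so alpha = mean**2 / variance and beta = variance / mean.
    """
    mean = statistics.fmean(lengths)
    var = statistics.variance(lengths)
    return mean * mean / var, var / mean


def gamma_pdf(x, alpha, beta):
    """Gamma density under the assumed shape/scale parametrization."""
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)


def gamma_mode(alpha, beta):
    """Location of the density peak; 0 when alpha <= 1 (no interior peak)."""
    return (alpha - 1) * beta if alpha > 1 else 0.0


# Synthetic "sequence lengths" (illustrative parameter values, not from the paper).
rng = random.Random(0)
peaked_lengths = [rng.gammavariate(1.5, 800.0) for _ in range(20000)]
decreasing_lengths = [rng.gammavariate(0.7, 1500.0) for _ in range(20000)]

alpha_peaked, beta_peaked = fit_gamma_moments(peaked_lengths)
alpha_decreasing, beta_decreasing = fit_gamma_moments(decreasing_lengths)

print("alpha > 1 sample:", alpha_peaked, beta_peaked, gamma_mode(alpha_peaked, beta_peaked))
print("alpha < 1 sample:", alpha_decreasing, beta_decreasing, gamma_mode(alpha_decreasing, beta_decreasing))
```

With α > 1 the density rises to a peak at (α − 1)β and then falls, so very short sequences are rare; with α < 1 the density decreases monotonically, so the count keeps growing as length shrinks. A sample that is missing short sequences or contains too many of them can therefore move the fitted α across 1, which is the kind of sensitivity the abstract's refit on supplemented short-sequence data points to.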
Authors 冯立芹 (Feng Liqin), 李宏 (Li Hong)
Source Journal of Inner Mongolia Minzu University: Natural Sciences (《内蒙古民族大学学报(自然科学版)》), 2009, No. 1, pp. 58-64 (7 pages)
Keywords Human genome; Protein coding sequence; Γ(α, β) distribution; Evolution rate of chromosome