Study on the Distribution of Protein Coding Sequences' Number with Its Length in the Human Genome

Abstract: The distribution of the number of protein coding sequences as a function of their length was analyzed for the 24 chromosomes of the human genome, and the distributions were found to be clearly similar across chromosomes. When the observed distributions were fitted with a Γ(α, β) distribution, the characteristic parameter α was smaller than 1 in every case; that is, the number of protein coding sequences keeps increasing as their length decreases. In the other organisms studied (15 bacteria, 10 archaea, and 5 eukaryotes), by contrast, the distributions were all Γ(α, β) distributions with α > 1. From this comparison, the authors infer that the distribution of human protein coding sequences should also be a Γ(α, β) distribution with α > 1. After the short sequences were supplemented with inferred data and the data were fitted again, a good fit was obtained, with α values between 1.19 and 1.85. The Γ(α, β) distribution followed by genes is of value for studying genome evolution and for evaluating the reliability of genes identified by theoretical prediction methods.
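The role of α described in the abstract can be illustrated numerically. The following is a minimal Python sketch, not the authors' procedure. It assumes that Γ(α, β) is parameterized by shape α and scale β, uses a simple method-of-moments fit (the fitting method used in the paper is not stated in this record), and runs on synthetic lengths rather than the paper's data. All function names are hypothetical.

```python
import math
import random
import statistics


def gamma_pdf(x, alpha, beta):
    """Density of Gamma(shape=alpha, scale=beta) at x > 0 (assumed parameterization)."""
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)


def fit_gamma_moments(lengths):
    """Method-of-moments estimates: alpha = mean^2 / var, beta = var / mean."""
    mean = statistics.fmean(lengths)
    var = statistics.pvariance(lengths)
    return mean * mean / var, var / mean


def gamma_mode(alpha, beta):
    """Interior peak of the density, or None when alpha <= 1 (no interior peak)."""
    return (alpha - 1) * beta if alpha > 1 else None


# Synthetic sequence lengths (hypothetical values, NOT the paper's data).
rng = random.Random(0)
sample = [rng.gammavariate(1.5, 800.0) for _ in range(20000)]

alpha_hat, beta_hat = fit_gamma_moments(sample)
print(f"alpha = {alpha_hat:.2f}, beta = {beta_hat:.1f}, mode = {gamma_mode(alpha_hat, beta_hat)}")

# With alpha < 1 the density falls monotonically, so shorter lengths are always more frequent.
print(gamma_pdf(100, 0.8, 500.0) > gamma_pdf(200, 0.8, 500.0))
```

With α < 1 the fitted curve has no peak and rises without bound toward zero length, which matches the "number keeps increasing as length decreases" behavior reported for the uncorrected human data. With α > 1 the curve peaks at (α − 1)β under this parameterization, as reported for the other organisms.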

Authors: Feng Liqin (冯立芹), Li Hong (李宏)

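Affiliations: College of Physics and Electronic Information, Inner Mongolia Minzu University; Department of Physics, Inner Mongolia University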

Source: Journal of Inner Mongolia Minzu University: Natural Sciences (《内蒙古民族大学学报（自然科学版）》), 2009, No. 1, pp. 58-64 (7 pages)

Keywords: human genome; protein coding sequence; Γ(α, β) distribution; evolution rate of chromosome

Classification: Q7 [Biology: Molecular Biology]


