
The Pattern of 8-mer Spectrum Distribution of Genome Sequences in Higher Organisms and Its Application in the Study of Species Evolution
Abstract The 8-mer spectrum of a genome sequence is species-specific, and deciphering its intrinsic regularities is important for revealing the compositional structure and evolutionary pattern of genome sequences. This study computed the 8-mer spectrum distributions of 66 species and found that the spectra of higher mammals are mainly trimodal, those of birds and reptiles are mainly bimodal, and those of fish and invertebrates are mainly unimodal. To further investigate the composition of the genomic 8-mer spectrum, 16 XY dinucleotide classification methods were applied. Only the CG classification showed the following two features: (1) the 8-mer spectra of the CG0, CG1, and CG2 subsets are each unimodal, and the three peaks are separated from one another; (2) relative to the random center position, the spectra of the CG1 and CG2 subsets lie far from the random center, whereas the spectrum of the CG0 subset is distributed around it. To further verify the relationship between the spectrum distributions of the CG0, CG1, and CG2 subsets and species evolution, a phylogenetic tree of the 66 species was constructed from separability indices of the three CG subset spectra. The tree divides the species into four clusters: higher mammals, birds and reptiles, fish, and invertebrates. These results indicate that the spectrum distributions of the three CG subsets are closely related to the evolutionary information of species genomes.
Authors YANG Zhenhua (杨振华); WANG Li (王丽); ZHENG Yan (郑燕) (School of Economics and Management, Inner Mongolia University of Science & Technology, Baotou, 014010; Faculty of Chemistry, Baotou Teachers' College, Baotou, 014030; School of Medical Technology and Anaesthesia, Baotou Medical College, Baotou, 014040)
Source Genomics and Applied Biology (《基因组学与应用生物学》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2024, Issue 2, pp. 228-240 (13 pages)
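Funding Jointly supported by the Natural Science Foundation of Inner Mongolia (2018BS03021) and the Scientific and Technological Research Project of Higher Education Institutions of the Inner Mongolia Autonomous Region (NJZY23086).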
Keywords Genome sequence; 8-mer spectrum; Separability; Phylogenetic tree
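
The counting and partition steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that CG0, CG1, and CG2 denote 8-mers containing zero, one, and two or more CG dinucleotides respectively (the abstract does not define the subsets), and the function names `count_kmers`, `cg_class`, `split_by_cg`, and `spectrum` are hypothetical. The separability index and the tree construction are not specified in the abstract and are left out.

```python
from collections import Counter


def count_kmers(seq, k=8):
    """Count overlapping k-mers, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def cg_class(kmer):
    """Return 0, 1, or 2 for zero, one, or at least two CG dinucleotides.

    Assumption: this is the meaning of CG0/CG1/CG2. "CG" cannot overlap
    itself, so str.count gives the exact number of occurrences.
    """
    return min(kmer.count("CG"), 2)


def split_by_cg(counts):
    """Partition k-mer counts into the three assumed CG subsets."""
    subsets = {0: {}, 1: {}, 2: {}}
    for kmer, c in counts.items():
        subsets[cg_class(kmer)][kmer] = c
    return subsets


def spectrum(counts):
    """Map an occurrence count to the number of distinct k-mers having it.

    Only observed k-mers are included; k-mers that never occur in the
    sequence (count 0) are not represented in this sketch.
    """
    return Counter(counts.values())


# Toy input; a real analysis would read a whole genome sequence.
toy = "ACGTACGTAA"
toy_counts = count_kmers(toy)
toy_subsets = split_by_cg(toy_counts)
toy_spectra = {label: spectrum(sub) for label, sub in toy_subsets.items()}
```

Plotting each entry of `toy_spectra` as a histogram for a real genome would give one curve per assumed CG subset, which is the kind of per-subset distribution the abstract compares.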