Abstract
The 8-mer spectrum of a genome sequence is species-specific, and deciphering its intrinsic regularities is important for revealing the structural composition and evolutionary patterns of genome sequences. This study computed the 8-mer spectrum distributions of 66 species and found that they are mainly trimodal in higher mammals, mainly bimodal in birds and reptiles, and mainly unimodal in fish and invertebrates. To further investigate the composition of the genomic 8-mer spectrum, 16 XY dinucleotide classification methods were applied. The results showed that only the CG classification had the following two characteristics: (1) the 8-mer spectra of the CG0, CG1, and CG2 subsets each showed a unimodal distribution, and the three peaks were separated from one another; (2) relative to the random center, the spectra of the CG1 and CG2 subsets lay far from it, whereas the spectrum of the CG0 subset was distributed around it. To further verify the relationship between the spectrum distributions of the CG0, CG1, and CG2 subsets and species evolution, a phylogenetic tree of the 66 species was constructed from separability indices of the three CG subset spectra. The tree divided the species into four clusters: higher mammals, birds and reptiles, fish, and invertebrates. These results indicate that the spectrum distributions of the three CG subsets are closely related to the evolutionary information of species genomes.
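The CG classification described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the 8-mer spectrum is the number of distinct 8-mers observed at each occurrence count, and that CG0, CG1, and CG2 denote 8-mers containing zero, one, and two or more CG dinucleotides (the abstract does not spell out these definitions). The function names `count_kmers`, `cg_class`, and `spectrum_by_class` are hypothetical.

```python
from collections import Counter


def count_kmers(seq, k=8):
    """Count overlapping k-mers, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    valid = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= valid:
            counts[kmer] += 1
    return counts


def cg_class(kmer, dinuc="CG"):
    """Return 0, 1, or 2 for the number of `dinuc` occurrences in the k-mer.

    Assumption: counts of two or more are pooled into class 2.
    """
    n = sum(1 for i in range(len(kmer) - 1) if kmer[i:i + 2] == dinuc)
    return min(n, 2)


def spectrum_by_class(counts, dinuc="CG"):
    """Build one spectrum per class: occurrence count -> number of k-mers.

    k-mers that never occur in the sequence are not included.
    """
    spectra = {0: Counter(), 1: Counter(), 2: Counter()}
    for kmer, c in counts.items():
        spectra[cg_class(kmer, dinuc)][c] += 1
    return spectra


if __name__ == "__main__":
    toy = "ACGTACGTACGT"  # toy sequence, not real genome data
    for cls, spec in spectrum_by_class(count_kmers(toy)).items():
        print(f"CG{cls}:", dict(spec))
```

Passing a different `dinuc` argument gives the analogous split for any of the other XY dinucleotides. The separability index and the random center mentioned in the abstract are not defined there, so they are left out of this sketch.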
Authors
杨振华
王丽
郑燕
YANG Zhenhua; WANG Li; ZHENG Yan (School of Economics and Management, Inner Mongolia University of Science & Technology, Baotou, 014010; Faculty of Chemistry, Baotou Teachers' College, Baotou, 014030; School of Medical Technology and Anaesthesia, Baotou Medical College, Baotou, 014040)
Source
《基因组学与应用生物学》
CAS
CSCD
Peking University Core Journals (北大核心)
2024, Issue 2, pp. 228-240 (13 pages)
Genomics and Applied Biology
Funding
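Jointly funded by the Natural Science Foundation of Inner Mongolia (2018BS03021) and the Scientific and Technological Research Project of Higher Education Institutions of the Inner Mongolia Autonomous Region (NJZY23086).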