Abstract
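When phylogenetic trees are constructed, their topologies can be discordant across different genomic regions. To address this problem, Bayesian concordance analysis (BCA) can perform phylogenetic analysis at whole-genome scale and then quantify the discordance statistically. This method was applied to the 129S1 mouse strain (Mus musculus), which was produced by backcrossing C3H/Hu mice with 129Sv mice over multiple generations. A set of sequence files was taken as input and processed with several bioinformatics tools (e.g. VCFtools, RepeatMasker, PAUP*4.0, MrModeltest, MrBayes), supplemented by Perl scripts, for steps such as repeat masking and sequence alignment, yielding genome-wide information on phylogenetic discordance among different segments. Across all 99 loci on mouse chromosome 10, the topology placing strains 129S1 and 129Sv as sisters accounted for 84.7% (the highest posterior probability), indicating that C3H/Hu mice contributed relatively little to the 129S1 genome. The results show that BCA is useful for studying the evolutionary histories of different genomic segments.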
When phylogenetic trees are constructed, topologies may be discordant across different genomic regions. To address this issue, Bayesian concordance analysis (BCA) can be used to perform phylogenetic analysis across the whole genome and to quantify the discordance statistically. In this article, BCA was used to analyze the phylogenetic history of the 129S1 strain (Mus musculus), which was derived from several generations of backcrossing between the C3H/Hu strain and the 129/Sv strain. Supplemented by Perl scripts, our pipeline took the genome sequence files as input and called several bioinformatics tools (e.g. VCFtools, RepeatMasker, PAUP*4.0, MrModeltest, MrBayes) to mask repeat sequences, align sequences, and carry out related steps. We then obtained phylogenetic discordance information for loci across the whole genome. Among all 99 loci on chromosome 10, 87.4% supported a single topology grouping 129S1 with 129P2 (the highest posterior probability), which is consistent with the hypothesis that the C3H/Hu mouse contributed little to the 129S1 genome. Our results indicate that BCA can benefit further studies of the evolutionary histories of different genomic regions.
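The per-locus summary reported above (the share of loci favoring each topology) can be illustrated with a minimal Python sketch. This is not the authors' pipeline and not a full BCA: it assumes that per-locus posterior probabilities for each candidate topology have already been obtained (for example from a Bayesian tree sampler), and it only tallies them. The function names, the Newick-style topology labels, and the toy numbers are all hypothetical.

```python
from collections import Counter

# Hypothetical labels for the three rooted topologies of the three strains.
T_129 = "((129S1,129Sv),C3H)"
T_C3H = "((129S1,C3H),129Sv)"
T_ALT = "((129Sv,C3H),129S1)"


def topology_support(locus_posteriors):
    """Fraction of loci whose highest-posterior topology is each topology."""
    if not locus_posteriors:
        raise ValueError("need at least one locus")
    # Pick the best-supported topology at every locus, then count.
    best = [max(post, key=post.get) for post in locus_posteriors]
    counts = Counter(best)
    return {topo: n / len(best) for topo, n in counts.items()}


def mean_posterior(locus_posteriors):
    """Average posterior probability of each topology across loci.

    A crude stand-in for a genome-wide concordance summary; real BCA
    additionally models how loci share gene trees.
    """
    if not locus_posteriors:
        raise ValueError("need at least one locus")
    totals = Counter()
    for post in locus_posteriors:
        for topo, p in post.items():
            totals[topo] += p
    return {topo: s / len(locus_posteriors) for topo, s in totals.items()}


if __name__ == "__main__":
    # Toy data for four imaginary loci (not taken from the paper).
    toy = [
        {T_129: 0.75, T_C3H: 0.25, T_ALT: 0.0},
        {T_129: 1.0, T_C3H: 0.0, T_ALT: 0.0},
        {T_129: 0.5, T_C3H: 0.25, T_ALT: 0.25},
        {T_129: 0.25, T_C3H: 0.75, T_ALT: 0.0},
    ]
    print(topology_support(toy))
    print(mean_posterior(toy))
```

With real data, each dictionary would hold the posterior probabilities for one locus, and the first function would yield the kind of percentage quoted in the abstract.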
Source
《动物学杂志》
CAS
CSCD
PKU Core Journal
2015, Issue 3, pp. 470-476 (7 pages)
Chinese Journal of Zoology
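Funding
National Natural Science Foundation of China (No. 31171199)
Shanghai Innovation Action Plan Laboratory Animal Research Project (No. 11140900200, 13140900300)
Fundamental Research Funds for the Central Universities
Donghua University "Lizhi Program" (No. B201308)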