Abstract
The complete genome sequence of severe acute respiratory syndrome associated coronavirus (SARS-CoV) BJ202 (AY864806) was obtained by direct sequencing of a stool sample from a SARS patient in Beijing. Comparative genomics methods were used to analyze sequence variation across 116 SARS-CoV genomes: BJ202 plus the 115 genomes available in NCBI GenBank. With the GZ02 genome as the reference, 41 polymorphic sites were identified in BJ202, and a total of 278 single nucleotide polymorphism (SNP) sites were present in at least two of the 116 genomes. The polymorphic sites were unevenly distributed over the genome: about half of them (50.4%, 140/278) fell within the 3' terminal third of the genome (19.0 kb-29.7 kb). The regions encoding Orf10-11, Orf3/4, and the E, M and S proteins had the highest mutation rates. Fifteen PCR-amplified cDNA fragments (about 6.0 kb in total) were cloned and sequenced: 11 fragments covering 12 known polymorphic sites of BJ202 and 4 fragments without known polymorphic sites. Three polymorphic sites unique to BJ202 (positions 13 804, 15 031 and 20 792) and three other polymorphic sites (26 428, 26 477 and 27 243) each showed two different nucleotides. Position 18 379, which is not polymorphic in any of the other 115 published SARS-CoV genomes, also proved to be a polymorphic site: of 14 clones, 8 carried A and 6 carried G at this position. Across the 116 SARS-CoV genomes, 18 types of deletions and 2 types of insertions were identified. Most of the deletions lay in a 300 bp region (27 700-28 000) that encodes parts of the putative ORF9 and ORF10-11. A phylogenetic tree of the 116 SARS-CoV genomes was constructed with the neighbor-joining method; BJ202 was phylogenetically close to BJ01 and LLJ-2004.
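The site-counting step described in the abstract can be illustrated with a minimal sketch. It assumes the genomes are already aligned to the reference and have equal length; the function names, the toy sequences, and the rule of ignoring gaps and N are illustrative assumptions, not the authors' pipeline.

```python
def polymorphic_sites(reference, genomes, min_genomes=2):
    """Return 1-based positions where at least `min_genomes` aligned
    genomes carry a base different from the reference.

    Assumption: all sequences are pre-aligned and of equal length;
    gap ('-') and ambiguous ('N') characters are not counted as variants.
    """
    sites = []
    for i, ref_base in enumerate(reference):
        if ref_base == "-":
            continue  # skip alignment columns absent from the reference
        differing = sum(
            1 for g in genomes if g[i] != ref_base and g[i] not in "-N"
        )
        if differing >= min_genomes:
            sites.append(i + 1)
    return sites


def fraction_in_region(sites, start, end):
    """Fraction of sites whose position lies in [start, end] (inclusive)."""
    if not sites:
        return 0.0
    inside = sum(1 for s in sites if start <= s <= end)
    return inside / len(sites)


# Toy data (hypothetical 10-base sequences, not real SARS-CoV genomes).
reference = "ACGTACGTAC"
genomes = [
    "ACGTACGTAC",
    "ACGTACGAAC",
    "ACGTACGAAC",
    "ACTTACGTAT",
    "ACGTACGTAT",
]

sites = polymorphic_sites(reference, genomes)
share_3prime = fraction_in_region(sites, 7, 10)

# The percentage reported in the abstract follows from the stated counts.
reported_share = round(140 / 278 * 100, 1)
```

In the toy alignment, position 3 differs from the reference in only one genome, so it is excluded by the "at least two genomes" threshold, while positions 8 and 10 are kept.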
Funding
This work was supported by the Science Foundation of Wenzhou City (No. Y2003A005) and Zhejiang University (No. 181130-544301).
Keywords
severe acute respiratory syndrome associated coronavirus (SARS-CoV)
genome
polymorphism