Abstract
Using the Illumina HiSeq 2500 high-throughput sequencing platform and RNA-Seq, this study applied MISA and GATK to characterize SSR and SNP loci in the transcriptome of Batrachium bungei growing on the Qinghai-Tibet Plateau. A total of 297,224 unigenes were obtained, and sequences 0~2000 bp in length accounted for 97.40% of all sequences. In total, 26,086 SSR loci were identified, with a distribution frequency of 8.78% and a distribution density of one SSR per 12.8 kb. Half of the SSRs were 10~14 bp long, and only 1% exceeded 100 bp. Mononucleotide repeats were the dominant SSR type (54.70%), followed by trinucleotide repeats (24.71%). Among mononucleotide repeats, A/T accounted for 96.52%; among the 10 trinucleotide repeat types, AAG/CTT was the most abundant (22.17%). When the repeat number of an SSR unit was greater than 5, mononucleotides were the main motif; when it was less than 5, trinucleotides were the main motif. A total of 8,712,752 SNP loci were identified, with a distribution density of one SNP per 38 bp. Homozygous SNPs were three times as numerous as heterozygous SNPs. Transitions accounted for 64.4% of SNPs and transversions for 35.6%, so transitions were the predominant variation type. By sequencing depth, most SNPs fell in the ≤30 range (53.84%), followed by the 31~100 range (32.8%); the 401~500 range contained the fewest (0.04%), and no SNPs were found at depths greater than 500. These results provide a scientific basis for research on the conservation, breeding and genetic diversity of Batrachium bungei and on the molecular mechanisms by which it adapts to the extreme environment of the Qinghai-Tibet Plateau.
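The summary statistics above are related by simple arithmetic, and a short sketch can make those relationships explicit. The Python snippet below is illustrative only: it assumes that the SSR frequency is defined as the number of SSR loci divided by the number of unigenes (a common convention that reproduces the reported 8.78%, but one the abstract does not state), and that each density is the total assembled length divided by the number of loci. All function and constant names are hypothetical.

```python
# Figures taken from the abstract.
N_UNIGENES = 297_224
N_SSR = 26_086
N_SNP = 8_712_752
KB_PER_SSR = 12.8        # reported density: one SSR per 12.8 kb
BP_PER_SNP = 38          # reported density: one SNP per 38 bp
TRANSITION_PCT = 64.4
TRANSVERSION_PCT = 35.6


def ssr_frequency_pct(n_ssr: int, n_unigenes: int) -> float:
    """SSR frequency as a percentage, ASSUMING frequency = SSR loci / unigenes."""
    return 100.0 * n_ssr / n_unigenes


def implied_length_mb_from_ssr(n_ssr: int, kb_per_ssr: float) -> float:
    """Total sequence length (Mb) implied by the SSR count and SSR density."""
    return n_ssr * kb_per_ssr / 1000.0


def implied_length_mb_from_snp(n_snp: int, bp_per_snp: float) -> float:
    """Total sequence length (Mb) implied by the SNP count and SNP density."""
    return n_snp * bp_per_snp / 1_000_000.0


def ts_tv_ratio(transition_pct: float, transversion_pct: float) -> float:
    """Transition/transversion ratio from the two reported percentages."""
    return transition_pct / transversion_pct


if __name__ == "__main__":
    print(f"SSR frequency: {ssr_frequency_pct(N_SSR, N_UNIGENES):.2f}%")
    print(f"Length implied by SSR density: "
          f"{implied_length_mb_from_ssr(N_SSR, KB_PER_SSR):.1f} Mb")
    print(f"Length implied by SNP density: "
          f"{implied_length_mb_from_snp(N_SNP, BP_PER_SNP):.1f} Mb")
    print(f"Ts/Tv ratio: {ts_tv_ratio(TRANSITION_PCT, TRANSVERSION_PCT):.2f}")
```

Under these assumptions, the two densities imply total lengths of the same order (roughly 330 Mb), which is a quick consistency check on the reported figures rather than a value given in the source.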
Authors
CHEN Zhong-hai
LIU Tai-long
CHEN Fei-fei
WU Xuan-feng
ZHAO Ning
LIU Xing
Affiliations: College of Science, Tibet University, Lhasa 850000, China; College of Life Science, Wuhan University, Wuhan 430072, China
Source
Environmental Ecology (《环境生态学》)
2021, No. 11, pp. 53-58 (6 pages)
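Funding
Jointly funded by:
National Natural Science Foundation of China (31860046)
Ecology "Huang Danian-style" Teaching Team Construction Project (藏财预指〔2020〕1号)
Natural Science Foundation of the Tibet Autonomous Region (XZ2019ZR G-12(Z))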