Abstract
Next-generation sequencing (NGS) technologies have advanced genomic and medical research, particularly the identification of disease-causing variants. Although most types of genetic variants can be identified through NGS data analysis, some limitations remain, such as detecting length variations of short tandem repeats (STRs). Many genetic diseases, especially neurological disorders such as Huntington disease, are known to be caused by STR expansions. However, almost no existing tool can use NGS data to detect STRs that have expanded beyond the sequencing read length. To overcome this limitation, we developed a novel method, based on paired-end NGS, for detecting STR length variations and estimating the length of expansions. We applied the method in a clinical study of motor neuron disease using whole-exome sequencing and successfully identified a disease-causing STR expansion. Our method is the first to use features of read coverage depth at STRs to address this variant-calling problem. It is widely applicable in human genetic disease research and may inform the development of other NGS data-analysis methods.
Source
Progress in Biochemistry and Biophysics (《生物化学与生物物理进展》)
SCIE
CAS
CSCD
Peking University Core Journals (北大核心)
2016, Issue 8, pp. 768-777 (10 pages)
Funding
Supported by grants from the National Natural Science Foundation of China (31171274) and the National Basic Research Program of China (2012CB725203).
Keywords
next-generation sequencing
short tandem repeats
length variation
motor neuron disease