Abstract
Target region capture sequencing is an effective way to sequence specific regions of the genome, such as the MHC (major histocompatibility complex) region or exonic regions. However, uneven probe design in capture sequencing makes sequencing depth highly variable within the captured region, so copy number variation (CNV) detection is more difficult than with whole-genome sequencing data. Several CNV detection methods for capture sequencing have been developed, but their accuracy remains low, especially for low-frequency CNVs. In this study, we therefore developed a new CNV detection method suitable for target region capture sequencing and whole-genome sequencing (WGS). The method has two features: (1) it detects CNVs interval by interval, using the intervals into which the region is divided as the unit of analysis, rather than calling CNVs directly in each individual; (2) it uses the information from all individuals in the population, detecting the segregation of a CNV from the distribution of read depth within an interval across the population. Assuming that an interval contains only one CNV, the read depth in that interval follows a three-peak (three-component) normal mixture distribution. We applied the method to CNV detection in MHC-region capture sequencing data from 21,327 individuals with psoriasis. The overlap rate, defined as the number of windows overlapping the gold standard as a percentage of the total number of gold-standard windows, was 7%, 18%, and 62% for XHMM, ExomeDepth, and the new method, respectively. Compared with XHMM and ExomeDepth, the new method increased the coverage of CNV detection within intervals by 55 and 44 percentage points, respectively. This study improves CNV detection methodology and provides a theoretical basis for disease diagnosis and treatment.
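The mixture assumption stated in the abstract can be illustrated with a short sketch. This is not the authors' implementation: it is a generic expectation-maximization (EM) fit of a three-component normal mixture to the normalized read depths of one interval across a population. The function name `fit_three_component_mixture`, the percentile-based initialization, and the simulated depths (components centred at 0.5, 1.0, and 1.5) are all assumptions made for illustration only.

```python
import numpy as np


def fit_three_component_mixture(depth, n_iter=200, tol=1e-8):
    """Fit a 1-D normal mixture with three components by EM (illustrative helper)."""
    x = np.asarray(depth, dtype=float)
    # Assumed initialization: means at the 5th, 50th and 95th percentiles.
    mu = np.percentile(x, [5, 50, 95])
    sigma = np.full(3, x.std() / 3 + 1e-6)
    weight = np.full(3, 1.0 / 3)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: weighted density of every individual under every component.
        z = (x[:, None] - mu) / sigma
        dens = weight * np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2 * np.pi))
        total = dens.sum(axis=1, keepdims=True) + 1e-300
        resp = dens / total  # posterior probability of each component
        ll = np.log(total).sum()
        # M-step: update mixing weights, means and standard deviations.
        nk = resp.sum(axis=0) + 1e-12
        weight = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma = np.sqrt(var) + 1e-6
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    # Labels come from the responsibilities of the last E-step.
    return weight, mu, sigma, resp.argmax(axis=1)


# Simulated (hypothetical) normalized read depths for one interval.
rng = np.random.default_rng(0)
depth = np.concatenate([
    rng.normal(0.5, 0.05, 100),   # lower-depth peak
    rng.normal(1.0, 0.05, 800),   # central peak
    rng.normal(1.5, 0.05, 100),   # higher-depth peak
])
weight, mu, sigma, label = fit_three_component_mixture(depth)
```

In this sketch the fitted means locate the three peaks and `label` assigns each individual to its most probable peak. A real analysis would also need to test whether three components fit better than one, which this example does not do.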
Authors
Yang Hao (杨浩)
Jiang Dan (姜丹)
Fang Ming (方铭)
College of Life Science and Technology, Heilongjiang Bayi Agricultural University, Daqing, 163319; Fisheries College, Jimei University, Xiamen, 361021
Source
Genomics and Applied Biology (《基因组学与应用生物学》)
CAS
CSCD
Peking University Core Journals (北大核心)
2021, Issue 1, pp. 435-441 (7 pages)
Funding
Supported by the General Program of the National Natural Science Foundation of China (31672399, 31872560).