Abstract
Copy number variation (CNV) is an important type of genomic structural variation and plays an important role in the onset and progression of many complex human diseases. Current CNV detection research focuses mainly on identifying CNVs in a single sample relative to a reference sequence, or in paired samples. Such purely individual-level analysis is limited to comparisons between individuals and cannot support genetic analysis of transmission from parents to offspring. Using father-mother-child trio data from the 1000 Genomes Project, we searched for regions in which the offspring differ from the father and the mother. We identified CNVs that the offspring inherited from their parents and inferred, through hierarchical clustering analysis, how these CNVs were generated. We also detected a small number of suspected de novo CNV candidates in the offspring, apparently homozygous variants relative to the parents.
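The trio comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract says is based on read depth and hierarchical clustering. It only assumes that CNV intervals have already been called for each family member on one chromosome, and it labels a child CNV by how much of it overlaps the parental calls. The function names, the half-open interval convention, and the 0.5 overlap threshold are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Assumption: a CNV call is a half-open interval [start, end) on one chromosome.
Interval = Tuple[int, int]


def overlap_fraction(region: Interval, others: List[Interval]) -> float:
    """Return the fraction of `region` covered by the union of `others`."""
    start, end = region
    covered = 0
    cursor = start  # rightmost position of `region` already counted
    for o_start, o_end in sorted(others):
        lo = max(o_start, cursor)
        hi = min(o_end, end)
        if hi > lo:
            covered += hi - lo
            cursor = hi
    return covered / (end - start)


def classify_child_cnv(
    child: Interval,
    father: List[Interval],
    mother: List[Interval],
    min_overlap: float = 0.5,  # hypothetical threshold, not from the paper
) -> str:
    """Label one child CNV by comparing it with the parental CNV calls."""
    in_father = overlap_fraction(child, father) >= min_overlap
    in_mother = overlap_fraction(child, mother) >= min_overlap
    if in_father and in_mother:
        return "inherited_both"
    if in_father:
        return "inherited_paternal"
    if in_mother:
        return "inherited_maternal"
    # Seen in neither parent: only a candidate, pending validation.
    return "de_novo_candidate"


if __name__ == "__main__":
    father_calls = [(90, 160)]
    mother_calls = [(500, 600)]
    for cnv in [(100, 200), (520, 580), (1000, 1100)]:
        print(cnv, classify_child_cnv(cnv, father_calls, mother_calls))
```

A child CNV with no sufficiently overlapping parental call is reported only as a candidate, matching the cautious wording of the abstract. In practice such calls would need checking against the parents' read depth, since a missed parental call looks the same as a true de novo event.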
Source
Journal of Southern Medical University (《南方医科大学学报》)
CAS
CSCD
Peking University Core Journals (北大核心)
2015, Issue 6, pp. 777-782 (6 pages)
Funding
National Natural Science Foundation of China (91131013)
Young Scientists Fund (31100952)
Keywords
copy number variation
read depth
hierarchical clustering