Abstract
[Objective] This paper discusses the analysis workflow for genomic variation detection and its limitations. [Scope of the literature] We collected and reviewed research literature related to the genomic variation detection workflow. [Methods] We first give a brief overview of the genomic variation detection analysis workflow, followed by an in-depth discussion of the three key stages of data quality control: raw data quality control, alignment quality control, and variant calling quality control. Data preprocessing is then introduced in terms of read alignment, sorting, and duplicate removal. Subsequently, variation detection is summarized from four aspects: variant detection, quality control, filtering, and annotation. Finally, existing problems are summarized and future prospects are discussed. [Results] With the advancement of next-generation sequencing technology, the genomic variation detection workflow is becoming more efficient, accurate, and scalable. [Limitations] Challenges include limitations in sequencing read length and the need for validation of the workflow in clinical laboratory applications. [Conclusions] This overview of the workflow is of research value for understanding the current status and developmental trends of genomic variation detection analysis.
Authors
栾海晶
牛北方
LUAN Haijing; NIU Beifang (China Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China; China University of Chinese Academy of Sciences, Beijing 100049, China)
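Funding
National Natural Science Foundation of China (92259101)
Strategic Priority Research Program of the Chinese Academy of Sciences (Category B) (XDB38040100)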
Keywords
data preprocessing
quality control
mutation detection
whole genome sequencing
mutation annotation