With the development of sequencing technologies,somatic mutation analysis has become an important component in cancer research and treatment.VarDict is a commonly used somatic variant caller for this task.Although the...With the development of sequencing technologies,somatic mutation analysis has become an important component in cancer research and treatment.VarDict is a commonly used somatic variant caller for this task.Although the heuristic-based VarDict algorithm exhibits high sensitivity and versatility,it may detect higher amounts of false positive variants than callers,limiting its clinical practicality.To address this problem,we propose DeepFilter,a deep-learning based filter for VarDict,which can filter out the false positive variants detected by VarDict effectively.Our approach trains two models for insertion-deletion mutations(InDels)and single nucleotide variants(SNVs),respectively.Experiments show that DeepFilter can filter at least 98.5%of false positive variants and retain 93.5%of true positive variants for InDels and SNVs in the commonly used tumor-normal paired mode.Source code and pre-trained models are available at https://github.com/LeiHaoa/DeepFilter.展开更多
基金This work was partially supported by the National Natural Science Foundation of China(NSFC)(Nos.62102231 and 61972231)the Shenzhen Basic Research Fund(No.JCYJ20180507182818013)+3 种基金the Key Project of Joint Fund of Shandong Province(No.ZR2019LZH007)Shandong Provincial Natural Science Foundation(No.ZR2021QF089)the PPP project from CSC and DAADEngineering Research Center of Digital Media Technology,Ministry of Education,China.
文摘With the development of sequencing technologies,somatic mutation analysis has become an important component in cancer research and treatment.VarDict is a commonly used somatic variant caller for this task.Although the heuristic-based VarDict algorithm exhibits high sensitivity and versatility,it may detect higher amounts of false positive variants than callers,limiting its clinical practicality.To address this problem,we propose DeepFilter,a deep-learning based filter for VarDict,which can filter out the false positive variants detected by VarDict effectively.Our approach trains two models for insertion-deletion mutations(InDels)and single nucleotide variants(SNVs),respectively.Experiments show that DeepFilter can filter at least 98.5%of false positive variants and retain 93.5%of true positive variants for InDels and SNVs in the commonly used tumor-normal paired mode.Source code and pre-trained models are available at https://github.com/LeiHaoa/DeepFilter.