Tandem duplication (TD) is a major type of structural variation (SV) that plays an important role in novel gene formation and human diseases. However, TDs are often missed or incorrectly classified as insertions by most modern SV detection methods because these methods lack specialized handling of TD-related mutational signals. Herein, we developed a TD detection module for the Pindel tool, referred to as Pindel-TD, based on a TD-specific pattern growth approach. Pindel-TD is capable of detecting TDs across a wide size range at single-nucleotide resolution. Using simulated and real read data from HG002, we demonstrated that Pindel-TD outperforms other leading methods in terms of precision, recall, F1-score, and robustness. Furthermore, by applying Pindel-TD to data generated from the K562 cancer cell line, we identified a TD located at the seventh exon of SAGE1, providing an explanation for its high expression. Pindel-TD is available for non-commercial use at https://github.com/xjtu-omics/pindel.
Funding: This work was supported by the National Key R&D Program of China (Grant No. 2022YFC3400300) and the National Natural Science Foundation of China (Grant Nos. 62172325, 32125009, and 32070663).
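To illustrate why a TD can be reported as a plain insertion, here is a minimal Python sketch. It is not Pindel-TD's pattern growth algorithm; it only shows the underlying idea that an inserted sequence identical to the reference sequence immediately adjacent to the insertion point is better described as a tandem duplication. The function name `classify_insertion`, the 0-based coordinate convention, and the exact-match rule (no tolerance for sequencing errors) are all assumptions made for this example.

```python
from typing import Optional, Tuple


def classify_insertion(
    ref: str, pos: int, ins_seq: str
) -> Tuple[str, Optional[int], Optional[int]]:
    """Label an insertion call as a tandem duplication when possible.

    Hypothetical helper, not part of Pindel or Pindel-TD.

    ref     -- reference sequence
    pos     -- 0-based index; ins_seq is inserted immediately before ref[pos]
    ins_seq -- the inserted bases reported by an SV caller

    Returns (label, start, end), where [start, end) is the duplicated
    reference interval (0-based, half-open) or (None, None) for a
    plain insertion.
    """
    n = len(ins_seq)
    if n == 0:
        raise ValueError("ins_seq must not be empty")
    if not 0 <= pos <= len(ref):
        raise ValueError("pos is outside the reference")

    # Case 1: the inserted bases copy the segment just upstream of pos.
    if pos - n >= 0 and ref[pos - n:pos] == ins_seq:
        return ("tandem_duplication", pos - n, pos)

    # Case 2: the inserted bases copy the segment just downstream of pos.
    # A slice that runs past the end of ref is shorter than ins_seq,
    # so the comparison fails safely.
    if ref[pos:pos + n] == ins_seq:
        return ("tandem_duplication", pos, pos + n)

    return ("insertion", None, None)


if __name__ == "__main__":
    reference = "ACGTTGCAGGCT"
    # "TGCA" occupies reference[4:8]; a second copy inserted at index 8
    # is a tandem duplication of that segment.
    print(classify_insertion(reference, 8, "TGCA"))
    # An unrelated sequence at the same position stays an insertion.
    print(classify_insertion(reference, 8, "AAAA"))
```

A real detector works from read alignments rather than finished insertion calls and has to tolerate mismatches and partial copies, so this check should be read only as a picture of the signal that distinguishes the two event types.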