
The Biological Significance of Multi-copy Regions and Their Impact on Variant Discovery

Abstract: Identification of genetic variants via high-throughput sequencing (HTS) technologies has been essential for both fundamental and clinical studies. However, to what extent the genome sequence composition affects variant calling remains unclear. In this study, we identified 63,897 multi-copy sequences (MCSs) with a minimum length of 300 bp, each of which occurs at least twice in the human genome. The 151,749 genomic loci (multi-copy regions, or MCRs) harboring these MCSs account for 1.98% of the genome and are distributed unevenly across chromosomes. MCRs containing the same MCS tend to be located on the same chromosome. Gene Ontology (GO) analyses revealed that 3800 genes whose UTRs or exons overlap with MCRs are enriched for Golgi-related cellular component terms and various enzymatic activities in the GO biological function category. MCRs are also enriched for loci that are sensitive to neocarzinostatin-induced double-strand breaks. Moreover, genetic variants discovered by genome-wide association studies and recorded in dbSNP are significantly underrepresented in MCRs. Using simulated HTS datasets, we show that false variant discovery rates are significantly higher in MCRs than in other genomic regions. These results suggest that extra caution must be taken when identifying genetic variants in MCRs via HTS technologies.
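The abstract defines an MCS as a sequence of at least a minimum length that occurs at least twice, and an MCR as a locus harboring one. The toy sketch below illustrates that definition only; it is not the authors' pipeline. It assumes exact forward-strand matches of a fixed length `k` (the study used a 300 bp minimum; the example uses 6 to stay readable), and the function names `find_multicopy_regions` and `covered_fraction` are hypothetical.

```python
from collections import defaultdict


def find_multicopy_regions(genome: str, k: int) -> list[tuple[int, int]]:
    """Return merged half-open [start, end) intervals covered by k-mers
    that occur at least twice in `genome` (exact, forward strand only)."""
    positions = defaultdict(list)
    for i in range(len(genome) - k + 1):
        positions[genome[i:i + k]].append(i)

    # Start coordinates of every k-mer occurrence that has at least one other copy.
    starts = sorted(
        p for plist in positions.values() if len(plist) >= 2 for p in plist
    )

    # Merge overlapping or touching k-mer windows into contiguous regions.
    regions: list[list[int]] = []
    for s in starts:
        if regions and s <= regions[-1][1]:
            regions[-1][1] = max(regions[-1][1], s + k)
        else:
            regions.append([s, s + k])
    return [(s, e) for s, e in regions]


def covered_fraction(regions: list[tuple[int, int]], genome_length: int) -> float:
    """Fraction of the genome covered by the given non-overlapping regions."""
    return sum(e - s for s, e in regions) / genome_length


# Toy input: "GATTACA" appears twice, at offsets 0 and 10.
toy = "GATTACACCGGATTACATT"
toy_regions = find_multicopy_regions(toy, k=6)   # [(0, 7), (10, 17)]
toy_fraction = covered_fraction(toy_regions, len(toy))  # 14 / 19
```

On a real genome, an index-based aligner would replace the in-memory k-mer dictionary; the dictionary approach is chosen here only because it makes the "occurs at least twice" criterion explicit.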
Source: Genomics, Proteomics & Bioinformatics (基因组蛋白质组与生物信息学报(英文版)); indexed in SCIE, CAS, CSCD; 2020, Issue 5, pp. 516-524 (9 pages).
Funding: Supported by the National Natural Science Foundation of China (NSFC, Grant No. 31771479); the Science Fund for Creative Research Groups of the NSFC (Grant No. 81521003); the Projects of International Cooperation and Exchanges of NSFC (Grant No. 61661146004); the Municipal Planning Projects of Scientific Technology of Guangdong (Grant No. 201804020083); the Science and Technology Program of Guangzhou (Grant No. 201400000004); the Natural Science Foundation of Guangdong (Grant No. 2015B050501006); the Team Program of Natural Science Foundation of Guangdong (Grant No. 2014A030312002); and the 1000 Talents Program of China.
