
A short-read mapping algorithm for genome sequencing based on overlap information

Maximum use of reads overlap information for short reads mapping
Abstract This paper presents Umap, a new algorithm for mapping short sequencing reads. Umap builds on the idea of progressively extending core fragments and uses the overlap between short reads to determine their positions. It first identifies all short reads that occur exactly once in the reference genome, termed unique short reads. Starting from these unique short reads, it then applies a greedy algorithm that uses read overlap information to extend them, and thereby determines the exact positions of the remaining non-unique short reads. Experiments show that Umap maps short reads more accurately than existing short-read mapping algorithms and maps more of them, with the proportion of matched short reads reaching 71%. Exploiting the overlap information that inherently exists among short reads allows reads to be placed on the reference genome more accurately and reduces ambiguous matches.
Source Journal of Southeast University (Natural Science Edition), 2011, No. 1, pp. 63-66 (4 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding National Natural Science Foundation of China (60671018, 60771024)
Keywords short reads; unique k-tuple; unique short reads; overlap information
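
The two stages described in the abstract (find the unique short reads, then greedily extend from them using overlap) can be illustrated with a small Python sketch. This is not the published Umap implementation: it assumes exact string matching against a toy reference, and it treats "overlap" as positional overlap between a candidate placement and an already placed read. The function names `find_occurrences`, `interval_overlap`, and `map_reads` are hypothetical and introduced only for illustration.

```python
from typing import Dict, List


def find_occurrences(reference: str, read: str) -> List[int]:
    """Return every start position where `read` matches `reference` exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


def interval_overlap(a_start: int, a_len: int, b_start: int, b_len: int) -> int:
    """Number of reference positions shared by two placed reads."""
    return max(0, min(a_start + a_len, b_start + b_len) - max(a_start, b_start))


def map_reads(reference: str, reads: List[str]) -> Dict[int, int]:
    """Map read index -> reference position (illustrative assumption, not Umap itself)."""
    candidates = {i: find_occurrences(reference, r) for i, r in enumerate(reads)}

    # Stage 1: unique short reads have exactly one candidate position.
    placed = {i: c[0] for i, c in candidates.items() if len(c) == 1}
    pending = {i for i, c in candidates.items() if len(c) > 1}

    # Stage 2: greedily place the ambiguous read whose candidate position
    # overlaps an already placed read the most; repeat until nothing changes.
    while pending:
        best = None  # (overlap, read index, position)
        for i in sorted(pending):
            for pos in candidates[i]:
                for j, placed_pos in placed.items():
                    ov = interval_overlap(pos, len(reads[i]), placed_pos, len(reads[j]))
                    if ov > 0 and (best is None or ov > best[0]):
                        best = (ov, i, pos)
        if best is None:
            break  # remaining reads have no overlap support and stay unmapped
        _, i, pos = best
        placed[i] = pos
        pending.remove(i)
    return placed


if __name__ == "__main__":
    reference = "AAACGTCCCTTTACGTGGG"  # "ACGT" occurs twice in this toy reference
    reads = ["AAACG", "ACGT", "GTCCC"]
    print(map_reads(reference, reads))
```

In this toy input, `"ACGT"` matches two positions, and the sketch resolves it to the copy that overlaps the two unique reads. Reads with no overlap support are left unmapped rather than assigned arbitrarily. A realistic mapper would also need to tolerate mismatches and break ties between equally supported candidates, which this sketch does not attempt.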