
Improved Gene Read Mapping Algorithm Based on MapReduce (cited by: 1)
Abstract: High-throughput sequencing technology generates massive volumes of gene read data, which has made parallel read mapping algorithms a research hotspot in recent years. This paper studies gene matching algorithms and proposes an improved gene read mapping algorithm based on MapReduce. By incorporating biological information into the read mapping process and using the Hadoop distributed cache mechanism, the algorithm's complexity is reduced to some extent. Experiments on an Arabidopsis gene data set show that the proposed algorithm effectively improves execution efficiency and reduces running time.
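Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2015, No. 8, pp. 82-85 (4 pages)
Funding: National Natural Science Foundation of China (61272222, 61003116); Key and Major Special Project of the Natural Science Foundation of Jiangsu Province (BK2011005); Natural Science Foundation of Jiangsu Province (BK2011782); Jiangsu Province Graduate Research and Innovation Program for Regular Universities (CXLX12_0415)
Keywords: read mapping; MapReduce; SeqMap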
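
The abstract describes read mapping expressed as MapReduce jobs. As a rough illustration of that general idea only, the following single-process Python sketch imitates the map, shuffle, and reduce phases with a simple seed-and-verify strategy. It is not the paper's algorithm: the seed length `K`, the function names, and the Hamming-distance check are all assumptions made for this toy example, and neither the biological information nor the Hadoop distributed cache mentioned in the abstract is modeled here.

```python
from collections import defaultdict

K = 4  # seed length; a hypothetical value chosen for this toy example


def map_reference(ref):
    """Map phase over the reference: emit (k-mer, ('ref', offset))."""
    for i in range(len(ref) - K + 1):
        yield ref[i:i + K], ("ref", i)


def map_read(read_id, read):
    """Map phase over one read: emit its non-overlapping seeds."""
    for j in range(0, len(read) - K + 1, K):
        yield read[j:j + K], ("read", read_id, j)


def shuffle(pairs):
    """Group values by key, as the MapReduce framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_seed(values):
    """Reduce phase for one k-mer: yield (read_id, candidate start in reference)."""
    ref_hits = [v[1] for v in values if v[0] == "ref"]
    read_hits = [(v[1], v[2]) for v in values if v[0] == "read"]
    for pos in ref_hits:
        for read_id, offset in read_hits:
            yield read_id, pos - offset


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_reads(ref, reads, max_mismatches=1):
    """Run the toy pipeline and return {read_id: [accepted start positions]}."""
    pairs = list(map_reference(ref))
    for read_id, read in reads.items():
        pairs.extend(map_read(read_id, read))

    # A shared seed only proposes a location; it still has to be verified.
    candidates = set()
    for values in shuffle(pairs).values():
        candidates.update(reduce_seed(values))

    hits = defaultdict(list)
    for read_id, start in sorted(candidates):
        read = reads[read_id]
        if 0 <= start <= len(ref) - len(read):
            window = ref[start:start + len(read)]
            if hamming(window, read) <= max_mismatches:
                hits[read_id].append(start)
    return dict(hits)


if __name__ == "__main__":
    reference = "ACGTTGCAACGTAGCT"
    reads = {"r1": "TTGCAACG", "r2": "ACGTAGCA"}
    print(map_reads(reference, reads))
```

Keying the intermediate pairs by seed lets the shuffle step bring matching reference and read fragments together, so each reducer only has to verify a small set of candidate positions. In a real Hadoop job, small shared data such as an index could be shipped to every worker through the distributed cache instead of being passed through the shuffle.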


