期刊文献+

基于枚举策略的三倍体个体单体型重建算法

Triploid Individual Haplotype Reconstruction Algorithm Based on Enumeration Strategy
下载PDF
导出
摘要 求解三倍体个体单体型对于探索三倍体物种的遗传特性和表型差异等方面的研究具有重要的推动作用。针对带基因型信息的最少错误更正(MEC/GI)模型,提出了一种基于枚举策略的三倍体个体单体型重建算法EHTR。该算法依次重建3条单体型上的每一个单核苷酸多态性位点取值,对于给定位点,首先根据其基因型取值枚举该位点的3种单体型取值情况,然后选择片段支持度最高的取值作为该位点的重建值,算法的总时间复杂度为O(mn+mlogm+cnl)。采用CELSIM和MetaSim两种测序片段模拟生成器生成实验测试数据,在片段覆盖率、错误率、单片段长度、单体型长度和单体型海明距离等参数的不同设置下,对算法EHTR,GTIHR,W-GA和Q-PSO的重建率和运行时间进行对比分析。实验结果显示,算法EHTR在不同的参数设置下均能以更短的运行时间获得更高的重建率。 Determining triploid individual haplotype plays an important role in promoting the study of exploring genetic traits and phenotypic differences of triploid species. In this paper,based on the minimum error correction with genotype information (MEC/GI) model, an enumeration strategy based enumeration haplotyping triploid (EHTR) algorithm was proposed for solving triploid individual haplotype reconstruction problem. The EHTR algorithm reconstructs the SNP sites of the three haplotypes one after another. When reconstructing a given SNP site,it enumerates three kinds of SNP values in terms of the genotype of the site, and chooses one with the most high support degree coming from the SNP fragments that are covering the corresponding SNP site The total time complexity is O(rnn+rnlogrn-bcnl). In the ex- periments, two kinds of simulators CELSIM and MetaSim were invoked to generate SNP fragments. The reconstruction rate and running time were compared and analyzed among algorithms EHTR, GTIHR, W-GA and Q-PSO with different parameters setting, such as fragment coverage, error rate, single fragment length, haplotype length and haplotype ham- ruing distance. Under different parameter setting, the EHTR algorithm can obtain higher reconstruction rate in shorter running time, which is proved by a number of experiments.
作者 张倩 吴璟莉 ZHANG Qian WU Jing-li(College of Computer Science and Information Technology, Guangxi Normal University, Guilin 541004, China Guangxi Key Lab of Multi-source Information Mining ~ Security, Guangxi Normal University, Guilin 541004, China Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, Guilin 541004, China)
出处 《计算机科学》 CSCD 北大核心 2017年第1期75-79,112,共6页 Computer Science
基金 国家自然科学基金项目(61363035 61502111) 广西自然科学基金项目(2015GXNSFAA139288 2013GXNSFBA019263 2012GXNSFAA053219) "八桂学者"工程专项 广西多源信息挖掘与安全重点实验室系统性研究基金项目(14-A-03-02 15-A-03-02)资助
关键词 序列分析 三倍体 单体型 基因型 最少错误更正 枚举 Sequence analysis, Triploid, Haplotype, Genotype, Minimum error correction, Enumeration
  • 相关文献

参考文献2

二级参考文献19

  • 1Greenberg, H.J., Hart, W.E., Lancia, G. Opportunities for combinatorial optimization in computational Biology. INFORMS Journal on Computing, 16(3): 211-231 (2004)
  • 2Lippert, R. Schwartza, R., Lancia, G., Istrail, S. Algorithmic strategies for the SNPs haplotype assembly problem. Briefings in Bioinformatrics, 3(1): 23-31 (2002)
  • 3Lancia, G., Bafna, V., Istrail, S., Lippert, R., Schwartz,R. SNPs problems, complexity and algorithms.In Proceedings of Annual European Symposium on Algorithms (ESA), volume 2161, Lecture Notes in Computer Science, 182-193, Springer-Verlag, Berlin, Heidelberg, 2001
  • 4Lund, C., Yannakakis, M. The approximation of maximum subgraph problems. In Proceedings of 20th Int.Colloqium on Automata, Languages and Programming, 40-51 Springer-Verlag Berlin, Heidelberg, 1994
  • 5Rizzi, R. Bafna, V., Istrail, S., Lancia, G. Practical algorithms and fixed-parameter tractability for the single individual SNP haploityping problem. In R. Guigo and D. Gusfield, editors. Proceedings of 2nd Annual Workshop on Algorithms in Bioinformaticcs (WABI), volum 2452 of Lecture Notes in Computer Science, 29-43, Springer-Verlag, Berlin, Heidelberg, 2002
  • 6Tardos, E. A strongly polynomial minimum cost circulation algorithm. Combinatorica, 5(3): 24-255(1985)
  • 7Venter, J. et al. The sequence of the human genome. Science, 291:1304-1351 (2001)
  • 8Weber, J., Myers. E. Human whole genome shotgun sequencing. Genome Research, 7:401-409 (1997)
  • 9Lancia G,Bafna V, Istrail S, et al. SNPs problems, complexity and algorithms [ C ]. In:Proceedings of the 9th Annual European Sym- posium on Algorithms, London: Springer-Verlag ,2001 : 182-193.
  • 10Filippo G. A comparison of several algorithms for the single indi- vidual SNP haplotyping reconstruction problem [ J ]. Bioinformat- ics,2010,26(18) :2217o2225.

共引文献2

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部