Abstract
This paper studies the triploid individual haplotype reconstruction problem under the minimum error correction (MEC) model and presents GTIHR, a genetic-algorithm-based method for reconstructing triploid individual haplotypes. The algorithm introduces a novel chromosome encoding and an effective hill-climbing operator. The relatively short chromosome encoding yields a smaller solution space, which helps the algorithm converge quickly to a good solution. In addition, the hill-climbing operator injects random information into the chromosome encoding to prevent premature convergence, and makes full use of the information in the SNP fragments to correct the encoded values step by step. In the experiments, fragment data were generated with CELSIM, a shotgun sequencing fragment simulator. Comparison with previous algorithms shows that GTIHR obtains haplotypes with a higher reconstruction rate, which makes it of practical value.
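The MEC objective named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition: each SNP fragment is assigned to whichever of the three haplotypes it disagrees with least, and the score is the total number of allele corrections this requires. The binary allele coding (`0`/`1`), the gap symbol `-`, and the function names `hamming` and `mec_score` are assumptions made for illustration; the abstract does not give the paper's exact formulation or the GTIHR chromosome encoding, so neither is reproduced here.

```python
from typing import Sequence

GAP = "-"  # assumed symbol for SNP sites a fragment does not cover


def hamming(fragment: str, haplotype: str) -> int:
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != GAP and f != h)


def mec_score(fragments: Sequence[str], haplotypes: Sequence[str]) -> int:
    """Total corrections needed when each fragment is assigned to its
    closest haplotype (three haplotypes for a triploid individual)."""
    return sum(min(hamming(f, h) for h in haplotypes) for f in fragments)


# Toy data: four fragments over four SNP sites, three candidate haplotypes.
fragments = ["010-", "-101", "11-0", "0100"]
haplotypes = ["0101", "1101", "1110"]
print(mec_score(fragments, haplotypes))  # prints 1
```

A search method such as a genetic algorithm would use a score like this as the fitness to minimize over candidate haplotype triples.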
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
PKU Core Journal (北大核心)
2014, Issue 4, pp. 840-844 (5 pages)
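Funding
Supported by the National Natural Science Foundation of China (61363015)
Supported by the Guangxi Natural Science Foundation (2011GXNSFB018068)
Supported by the Science and Technology Project of Guangxi Higher Education Institutions (2013YB028)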
Keywords
SNP (single nucleotide polymorphism)
triploid
haplotype
minimum error correction (MEC)
genetic algorithm