Abstract
A single nucleotide polymorphism (SNP) is a variation at a single nucleotide base in a DNA sequence and is the most common type of heritable variation in organisms. The ED algorithm and the SNP-index algorithm are two commonly used algorithms for computing SNP sites. Whole-genome sequencing data for the F2 generation of Arabidopsis thaliana were obtained by high-throughput sequencing, then filtered, screened, and aligned on a Linux platform. Both algorithms were applied to the results, and the number of SNP sites and the proportion of SNP genotypes detected by each were compared. The experimental results show that the ED algorithm yields more SNP sites, distributed more widely and with a higher relative distribution density than those from the SNP-index algorithm, although the number of SNP sites and the proportion of SNP genotypes obtained by the two algorithms are similar.
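The abstract names the two statistics without defining them. The minimal sketch below assumes the definitions commonly used in bulked segregant analysis: the SNP-index as the fraction of reads at a site that carry the non-reference allele, and ED read as the Euclidean distance between the base-frequency vectors of two sequenced pools. The function names, the dictionary layout for base counts, and the example numbers are all hypothetical, and the paper's actual pipeline may differ.

```python
import math

BASES = ("A", "C", "G", "T")


def snp_index(alt_reads, depth):
    """Assumed definition: reads supporting the non-reference allele / total depth."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth


def euclidean_distance(counts_a, counts_b):
    """Assumed definition: Euclidean distance between the base-frequency
    vectors of two pools at one site. Each argument maps a base to its
    read count (hypothetical layout)."""
    total_a = sum(counts_a.get(base, 0) for base in BASES)
    total_b = sum(counts_b.get(base, 0) for base in BASES)
    if total_a == 0 or total_b == 0:
        raise ValueError("each pool needs at least one read at the site")
    squared = 0.0
    for base in BASES:
        freq_a = counts_a.get(base, 0) / total_a
        freq_b = counts_b.get(base, 0) / total_b
        squared += (freq_a - freq_b) ** 2
    return math.sqrt(squared)


# Illustrative counts only, not data from the paper.
pool_1 = {"A": 18, "G": 2}
pool_2 = {"A": 5, "G": 15}
index_pool_2 = snp_index(alt_reads=15, depth=20)
ed_site = euclidean_distance(pool_1, pool_2)
```

Under these assumptions, a site where the two pools share the same allele frequencies gives an ED of 0, and a site fixed for different bases in each pool gives the maximum value, the square root of 2.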
Author
GAN Qiu-yun (甘秋云), School of Applied Science and Engineering, Fuzhou Institute of Technology, Fuzhou 350014, China
Source
《计算机工程与科学》
CSCD
PKU Core Journal (北大核心)
2022, No. 4, pp. 707-712 (6 pages)
Computer Engineering & Science
Funding
Fuzhou Institute of Technology University-Level Research Fund (FTKY21053).