Recognition of Risk SNPs Related to Genetic Diseases Based on K-means Clustering Algorithm
Cited by: 1
Abstract: Locating the single nucleotide polymorphism (SNP) loci on chromosomes that are associated with genetic diseases makes it possible to intervene at those loci, either to prevent hereditary diseases or to breed livestock and poultry for disease resistance, so identifying pathogenic SNP loci is of great significance in animal husbandry and veterinary science. This paper proposes a new method for identifying pathogenic SNP loci based on the K-means clustering algorithm. First, K-means clustering is applied to the numerical genotype codes of each SNP locus and the clustering accuracy is calculated. Next, pathogenic SNP loci are screened by identifying extreme outliers with a box plot. Finally, a chi-square test is used to verify the validity of the screening results. Experiments on simulated and real data sets showed that K-means clustering not only identified the pathogenic SNP loci of human genetic diseases accurately but also ran much faster than the logistic regression and random forest algorithms in common use. The method offers a new approach to processing large-scale livestock and poultry genetic data sets in real time.
Authors: ZHANG Hengyi (张恒益); ZHENG Huiling (郑惠玲) (College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China)
Source: Journal of Domestic Animal Ecology (《家畜生态学报》), Peking University Core Journal, 2020, No. 12, pp. 25-31 (7 pages)
Funding: Science and Technology Innovation and Transformation Project of the Shaanxi Provincial Department of Agriculture (NYKJ-2020-YL-05).
Keywords: K-means clustering algorithm; risk SNPs; box plot; chi-square test
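
The three-step pipeline summarized in the abstract (per-locus K-means, box-plot screening of extreme outliers, chi-square verification) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 0/1/2 genotype coding, k = 2, the "accuracy" defined as agreement between cluster labels and case/control status, the 3 × IQR outer fence for "extreme" outliers, the simulated data, and all function and variable names are assumptions that the abstract does not specify.

```python
import math
import random
import statistics

def two_means_1d(values, iters=50):
    """Lloyd's algorithm with k=2 on scalars; returns a 0/1 cluster label per value."""
    c0, c1 = min(values), max(values)  # deterministic initial centroids (assumption)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - c0) <= abs(v - c1) else 1 for v in values]
        g0 = [v for v, l in zip(values, labels) if l == 0]
        g1 = [v for v, l in zip(values, labels) if l == 1]
        n0 = statistics.fmean(g0) if g0 else c0
        n1 = statistics.fmean(g1) if g1 else c1
        if (n0, n1) == (c0, c1):  # centroids stopped moving
            break
        c0, c1 = n0, n1
    return labels

def clustering_accuracy(labels, status):
    """Agreement between clusters and case/control status, up to label swapping."""
    agree = sum(l == s for l, s in zip(labels, status)) / len(status)
    return max(agree, 1 - agree)

def extreme_high_outliers(scores, k=3.0):
    """Indices above Q3 + k*IQR; k=3 is Tukey's outer fence (assumed here)."""
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    fence = q3 + k * (q3 - q1)
    return [i for i, s in enumerate(scores) if s > fence]

def chi_square_p(genotypes, status):
    """Pearson chi-square test of genotype vs. status (needs both cases and controls)."""
    codes = sorted(set(genotypes))
    table = [[sum(1 for g, s in zip(genotypes, status) if s == r and g == c)
              for c in codes] for r in (0, 1)]
    n = len(genotypes)
    rows = [sum(r) for r in table]
    cols = [table[0][j] + table[1][j] for j in range(len(codes))]
    chi2 = sum((table[r][j] - rows[r] * cols[j] / n) ** 2 / (rows[r] * cols[j] / n)
               for r in (0, 1) for j in range(len(codes)))
    df = len(codes) - 1
    if df == 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2))  # exact chi-square tail for df = 1
    if df == 2:
        return math.exp(-chi2 / 2)             # exact chi-square tail for df = 2
    raise ValueError("only genotype codes 0/1/2 are supported in this sketch")

# Simulated data (hypothetical): 200 controls, 200 cases, 200 SNPs, one causal locus.
rng = random.Random(0)
status = [0] * 200 + [1] * 200
CAUSAL = 7
snps = [rng.choices([0, 1, 2], weights=[0.49, 0.42, 0.09], k=len(status))
        for _ in range(200)]
snps[CAUSAL] = [rng.choices([0, 1, 2],
                            weights=[0.05, 0.15, 0.80] if s else [0.80, 0.15, 0.05])[0]
                for s in status]

# Step 1: cluster each locus and score it; step 2: screen; step 3: verify.
accuracies = [clustering_accuracy(two_means_1d(g), status) for g in snps]
selected = extreme_high_outliers(accuracies)
p_values = {i: chi_square_p(snps[i], status) for i in selected}
print(selected, p_values)
```

Because each locus is clustered independently on a single scalar feature, the cost grows linearly with the number of loci, which is consistent with the speed advantage the abstract reports over model-fitting approaches.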