A K-Center Algorithm for SNP Selection
Abstract Single nucleotide polymorphism (SNP) data are an important resource for genetic pathology research. Such data are high-dimensional with few samples, contain substantial noise and redundancy, and exhibit linkage disequilibrium between SNP loci, so dimensionality reduction is needed. This paper proposes K-MSU, an improved K-Center algorithm. K-Center is used for dimensionality reduction, and symmetric uncertainty is introduced into its distance measure to account for linkage disequilibrium among SNPs. Because random selection of the initial cluster centers in K-Center can strongly affect the clustering result, a density method based on information gain is used to select the initial centers. Experiments on clinical data provided by a hospital show that K-MSU achieves higher classification accuracy in SNP selection.
Authors Cao Limin (曹莉敏); Zhou Conghua (周从华) (School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang 212013, Jiangsu, China)
Source Computer Applications and Software (《计算机应用与软件》), Peking University Core Journal, 2020, No. 9, pp. 227-234 (8 pages)
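Funding Jiangsu Province Key Research and Development Program (Social Development) projects (BE2016630, BE2017628); Wuxi Municipal Health and Family Planning Commission research project (Z201603).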
Keywords Single nucleotide polymorphism; SNP selection; K-Center; Feature selection; Symmetric uncertainty; Information gain
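
The abstract states that symmetric uncertainty is introduced into the K-Center distance measure, but it does not give the formula. The sketch below uses the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) and assumes that a distance of 1 - SU is a reasonable reading; the function names and the distance form are illustrative assumptions, not the paper's confirmed implementation.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def symmetric_uncertainty(x, y):
    """Standard symmetric uncertainty between two discrete columns, in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Assumption: two constant columns are treated as fully redundant.
        return 1.0
    hxy = entropy(list(zip(x, y)))      # joint entropy H(X, Y)
    mutual_info = hx + hy - hxy         # I(X; Y)
    return 2.0 * mutual_info / (hx + hy)


def su_distance(x, y):
    """Hypothetical distance for clustering SNP columns: 1 - SU(x, y)."""
    return 1.0 - symmetric_uncertainty(x, y)


# Two SNP columns encoded as genotype codes 0/1/2 (made-up toy values).
snp_a = [0, 1, 2, 0, 1, 2]
snp_b = [0, 1, 2, 0, 1, 2]   # identical to snp_a, so perfectly correlated
snp_c = [0, 0, 1, 1]
snp_d = [0, 1, 0, 1]         # statistically independent of snp_c

print(su_distance(snp_a, snp_b))
print(su_distance(snp_c, snp_d))
```

Under this reading, strongly linked SNPs sit close together and tend to fall into the same cluster, so keeping one representative per cluster removes redundant loci.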