Research on Clustering Mining Strategy Based on Medical Data Sets (cited by: 1)
Abstract: This paper studies the partitioning clustering algorithm K-medoids on medical data sets. To address the algorithm's random selection of initial cluster centers, slow convergence, and unstable clustering results, a variance-based density optimization algorithm is proposed. Starting from the mean square deviation and the mean distance of the sample set, the algorithm computes a density radius that also depends on the size of the sample set; under a common density radius, samples in dense regions have higher density. Samples from different high-density regions are dynamically selected as the initial cluster centers, and local optimization during clustering accelerates convergence, overcoming the shortcomings of traditional K-medoids. The optimized algorithm is tested on medical data sets from the UCI Machine Learning Repository. Experiments show that the selected initial cluster centers lie in dense regions of the sample set and better match the original distribution of the data; on the breast cancer data set, the algorithm achieves higher clustering accuracy, stable clustering results, and fast convergence.
Authors: WANG Yan-e (王艳娥); AN Jian (安健); WANG Hong-gang (王红刚); DING Xin-an (丁心安); YANG Qian (杨倩). Affiliations: School of Science and Technology, Xi'an Siyuan University, Xi'an 710038, China; Shenzhen Research Institute of Xi'an Jiaotong University, Shenzhen 518057, China
Source: Computer Technology and Development (《计算机技术与发展》), 2020, No. 7, pp. 66-70 (5 pages)
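Funding: Shaanxi Provincial Education Science Research Program (18JK1100); Shenzhen Science and Technology Program (JCYJ20170816100939373); Shaanxi Higher Education Science Research Project (XGH19236).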
Keywords: medical data; K-medoids algorithm; clustering; density optimization; variance
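
The density-based seeding described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's radius formula or its local optimization step, so everything specific here is an assumption: `density_radius` uses a placeholder formula (mean plus standard deviation of the pairwise distances, divided by the square root of the sample count), `init_medoids` is a hypothetical greedy selection that skips samples lying within the radius of an already chosen seed, and the refinement loop is the standard alternating K-medoids update rather than the authors' optimization.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def density_radius(D):
    """Placeholder radius (an assumption, NOT the paper's formula)."""
    n = D.shape[0]
    upper = D[np.triu_indices(n, k=1)]  # each unordered pair once
    return (upper.mean() + upper.std()) / np.sqrt(n)


def init_medoids(D, k, radius):
    """Hypothetical seeding: densest samples from distinct dense regions."""
    density = (D <= radius).sum(axis=1)  # neighbours within radius, incl. self
    available = np.ones(D.shape[0], dtype=bool)
    medoids = []
    for _ in range(k):
        if available.any():
            # Densest sample not yet covered by an earlier seed.
            candidate = int(np.where(available, density, -1).argmax())
        else:
            # Fallback (assumption): sample farthest from the chosen seeds.
            candidate = int(D[:, medoids].min(axis=1).argmax())
        medoids.append(candidate)
        available &= D[candidate] > radius  # mask this seed's neighbourhood
    return medoids


def k_medoids(X, k, max_iter=100):
    """Standard alternating K-medoids started from density-based seeds."""
    D = pairwise_distances(X)
    medoids = init_medoids(D, k, density_radius(D))
    for _ in range(max_iter):
        labels = D[:, medoids].argmin(axis=1)
        new_medoids = []
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:  # keep the old medoid for an empty cluster
                new_medoids.append(medoids[j])
                continue
            # New medoid: member with the smallest total in-cluster distance.
            costs = D[np.ix_(members, members)].sum(axis=1)
            new_medoids.append(int(members[costs.argmin()]))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    labels = D[:, medoids].argmin(axis=1)
    return medoids, labels
```

Calling `k_medoids(X, k)` on an `(n_samples, n_features)` float array returns the medoid indices and a cluster label per sample. Because the seeds are chosen deterministically from the density counts, repeated runs on the same data give the same result, which is the stability property the abstract attributes to density-based initialization.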