Intelligent Analysis Method of Regional Data Based on Fuzzy K-means Clustering Algorithm

Cited by: 2
Abstract: This paper studies data mining methods for regional population data in the medical industry. By introducing the concept of membership degree from fuzzy mathematics, the K-means algorithm no longer assigns each data point strictly to a single cluster center. This improves the stability of the assignment and makes the iteration converge more easily. The MapReduce model on the Hadoop platform is used to parallelize the proposed algorithm: the data are divided into slices, and each slice is clustered on a different computing node. The algorithm was tested on real medical insurance data from a region of Hebei Province. In clustering accuracy, the fuzzy K-means algorithm improves on the traditional algorithm by about 8.19%. On the distributed storage and computing cluster built for the paper, parallel computation on 8 nodes gives a Speedup of 3.6 and a Scaleup of 0.58. By making full use of the computing resources of each node, the method effectively reduces running time.
Author: ZHI Jianxun (支建勋) (The First Affiliated Hospital of Hebei North University, Zhangjiakou 075000, China)
Source: Electronic Design Engineering (《电子设计工程》), 2022, No. 10, pp. 46-49, 54 (5 pages)
Funding: Hebei Province Human Resources and Social Security Research Project (JRS-2020-3014).
Keywords: fuzzy mathematics; K-means; data mining; medical data; distributed computing
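
The abstract gives no formulas or job design, so the following is only a minimal sketch of the two ideas it names: soft membership degrees in place of hard assignment, and per-slice computation combined in a MapReduce style. It assumes the standard fuzzy c-means update rules with fuzzifier m = 2 and Euclidean distance. The function names, the toy data, and the choice to merge partial sums in a single reduce step are illustrative assumptions, not the paper's implementation, and no Hadoop API is used.

```python
import math

def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def memberships(point, centers, m=2.0):
    """Membership degree of one point in every cluster (standard fuzzy c-means rule).

    u_j = d_j^(-2/(m-1)) / sum_k d_k^(-2/(m-1)), so the degrees sum to 1.
    """
    dists = [distance(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]

def map_slice(points, centers, m=2.0):
    """Map step (assumed design): partial weighted sums for one data slice."""
    dim = len(centers[0])
    numerators = [[0.0] * dim for _ in centers]
    denominators = [0.0] * len(centers)
    for p in points:
        for j, u in enumerate(memberships(p, centers, m)):
            weight = u ** m
            denominators[j] += weight
            for d in range(dim):
                numerators[j][d] += weight * p[d]
    return numerators, denominators

def reduce_partials(partials, old_centers):
    """Reduce step (assumed design): merge the partial sums into new centers."""
    dim = len(old_centers[0])
    new_centers = []
    for j, old in enumerate(old_centers):
        den = sum(part[1][j] for part in partials)
        if den == 0.0:
            new_centers.append(list(old))  # keep a center that received no weight
            continue
        new_centers.append(
            [sum(part[0][j][d] for part in partials) / den for d in range(dim)]
        )
    return new_centers

def fuzzy_kmeans(slices, centers, m=2.0, tol=1e-6, max_iter=100):
    """Iterate map and reduce until the centers move less than tol."""
    for _ in range(max_iter):
        # On a real cluster each slice would be handled by a separate node.
        partials = [map_slice(s, centers, m) for s in slices]
        new_centers = reduce_partials(partials, centers)
        shift = max(distance(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers

# Hypothetical toy data: two slices drawn from two well-separated groups.
demo_slices = [
    [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0)],
    [(0.1, 0.3), (5.2, 4.9), (4.9, 5.1)],
]
final_centers = fuzzy_kmeans(demo_slices, [(1.0, 1.0), (4.0, 4.0)])
print(final_centers)
print(memberships((2.5, 2.5), final_centers))
```

In this sketch, only the per-cluster sums travel from the map step to the reduce step, so the data points themselves never leave their slice. A point midway between the two groups receives a membership degree in each cluster instead of being forced into one of them.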