Abstract
To address the sensitivity of the K-means algorithm to initial cluster centers and to noise, this paper proposes the d-K-means algorithm (distance & density). Building on K-means, it weighs the influence of density and distance on clustering and assigns a weight to each data point. On the basis of these weights, it applies the min-max principle to select the initial cluster centers and determines the number of cluster centers automatically. Experimental results show that d-K-means achieves good clustering results on both low-dimensional and high-dimensional data sets, handles data in low-density regions better, and selects cluster centers better.
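The abstract does not give the weighting formula or the rule for choosing the number of centers, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a simple neighbor-count density (the hypothetical `radius` parameter), a fixed number of seeds `k` supplied by the caller, and a greedy max-min selection in which each candidate's distance to its nearest chosen seed is scaled by its density weight. All function names are illustrative.

```python
import math


def density_weights(points, radius):
    """Assumed density: fraction of all points lying within `radius` of each point."""
    n = len(points)
    return [
        sum(1 for q in points if math.dist(p, q) <= radius) / n
        for p in points
    ]


def weighted_maxmin_seeds(points, k, radius):
    """Pick k initial centers by a density-weighted max-min rule (illustrative only)."""
    weights = density_weights(points, radius)
    indices = range(len(points))

    # Start from the densest point (ties resolved by lowest index).
    first = max(indices, key=lambda i: weights[i])
    seeds = [first]

    # nearest[i] is the distance from point i to its closest chosen seed.
    nearest = [math.dist(p, points[first]) for p in points]

    while len(seeds) < k:
        # Prefer points that are both far from existing seeds and in dense regions.
        nxt = max(indices, key=lambda i: weights[i] * nearest[i])
        seeds.append(nxt)
        nearest = [
            min(d, math.dist(p, points[nxt]))
            for d, p in zip(nearest, points)
        ]

    return [points[i] for i in seeds]


if __name__ == "__main__":
    data = [
        (0, 0), (0, 1), (1, 0), (1, 1),          # dense cluster A
        (10, 10), (10, 11), (11, 10), (11, 11),  # dense cluster B
        (20, 20),                                # isolated low-density point
    ]
    print(weighted_maxmin_seeds(data, k=2, radius=2))
```

In this toy data set, a plain max-min rule based on distance alone would pick the isolated point (20, 20) as the second seed because it is farthest from the first. The density weight lowers its score, so a point from the second dense cluster is chosen instead. The weight only attenuates distance, so a sufficiently remote outlier could still be selected under this simplified scoring.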
Authors
Tang Zekun (唐泽坤)
Zhu Zeyu (朱泽宇)
Yang Yi (杨裔)
Li Caihong (李彩虹)
Li Lian (李廉)
College of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China
Source
Application Research of Computers (《计算机应用研究》)
CSCD
PKU Core Journal (北大核心)
2020, Issue 6, pp. 1719-1723 (5 pages)
Funding
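National Key Research and Development Program of China (2018YFB1003205)
National Natural Science Foundation of China (61300230, 61370219)
Natural Science Foundation of Gansu Province (1107RJZA188)
Gansu Province Science and Technology Support Program (1104GKCA037)
Gansu Province Major Science and Technology Special Project (1102FKDA010)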