期刊文献+

基于多球分裂的增量式k-means聚类算法 被引量:2

Incremental k-means clustering algorithm based on multi-sphere splitting
原文传递
导出
摘要 针对Ball k-means(BKM)算法对初始化中心敏感、求解精度不高的问题,提出了一种基于多球分裂的增量式k-means聚类算法。该算法利用BKM需要记录各球聚类半径的特点,从给定的初始聚类中心个数开始,按照固定步长一次性产生多个增量聚类中心,直至得到k个初始中心。本文算法避免了所有n个s维数据之间的距离计算,在不增加空间复杂度的前提下将传统增量聚类中心选取的计算量由O(n^(2)s)降低至O(k log k)(k<n)。在8组UCI数据上与k-means、BKM、IK-+以及3个典型增量聚类算法的对比实验结果表明,本文算法降低了BKM的初始化敏感性,解的精度提升了37.15%~66.92%,是对比增量算法中唯一能够提升k-means效率的算法,且随着聚类个数的增加,效率优势更加明显。 To address the problem that the Ball k-means(BKM)algorithm is sensitive to the initialized centers and the solution accuracy is not high,an incremental k-means clustering algorithm based on multiball splitting is proposed.The algorithm takes advantage of the feature that BKM needs to record the clustering radius of each ball,starts from a given number of initial clustering centers,and generates multiple incremental clustering centers at once according to a fixed step until k initial centers are obtained.The algorithm avoids the calculation of distances between all n s-dimensional data and reduces the computation of traditional incremental clustering center selection from O(n^(2)s)to O(k log k)(k<n)without increasing the space complexity.The experimental results of comparison with k-means,BKM,IK-+and three typical incremental clustering algorithms on 8 UCI datasets show that the proposed algorithm reduces the initialization sensitivity of BKM and improves the solution accuracy by 37.15%~66.92%;it is the only algorithm among the comparative incremental algorithms that can improve the efficiency of k-means,and the efficiency advantage becomes more obvious as the number of clusters increases.
作者 曲福恒 钱超越 杨勇 陆洋 宋剑飞 胡雅婷 QU Fu-heng;QIAN Chao-yue;YANG Yong;LU Yang;SONG Jian-fei;HU Ya-ting(College of Computer Science and Technology,Changchun University of Science and Technology,Changchun 130022,China;College of Education,Changchun Normal University,Changchun 130032,China;College of Information Technology,Jilin Agricultural University,Changchun 130118,China)
出处 《吉林大学学报(工学版)》 EI CAS CSCD 北大核心 2022年第6期1434-1441,共8页 Journal of Jilin University:Engineering and Technology Edition
基金 吉林省教育厅“十三五”科学技术项目(JJKH20181164KJ) 吉林省教育厅科学技术研究项目(JJKH20220330KJ) 国家自然科学基金项目(41671397)。
关键词 模式识别 K-MEANS 增量聚类 ball k-means 多球分裂 pattern recognition k-means incremental clustering ball k-means multi-ball splitting
  • 相关文献

参考文献4

二级参考文献27

共引文献177

同被引文献13

引证文献2

二级引证文献1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部