Clustering by gravitational synchronization on large-scale datasets (大规模数据集引力同步聚类)

Cited by: 3
Abstract: Inspired by the Kuramoto model, this paper constructs a new gravitational synchronization model to address the high time complexity of existing synchronization clustering algorithms (Sync), and proposes large sample clustering by gravitational synchronization (LSCGS) for large-scale datasets. First, the large-scale dataset is condensed into a reduced set using the fast reduced set density estimator (RSDE). Next, the reduced set is clustered with the gravitational synchronization clustering algorithm, and the Davies-Bouldin index is used to select the best number of clusters automatically. Finally, the proposed remaining sample clustering (RSC) algorithm clusters the samples outside the reduced set, which makes it possible to distinguish isolated clusters from noise points. The method detects clusters of arbitrary shape, size, and number without assuming any data distribution. Experiments on large synthetic datasets, UCI real-world datasets, and image segmentation confirm the effectiveness of LSCGS: it achieves high clustering accuracy at a substantially lower computational cost than the conventional synchronization clustering algorithm.
Authors: QIAO Ying (乔颖), WANG Shi-tong (王士同), HANG Wen-long (杭文龙) (School of Digital Media, Jiangnan University, Wuxi 214122, China)
Source: Control and Decision (《控制与决策》), 2017, No. 6, pp. 1075-1083 (9 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61272210, 61170122); Natural Science Foundation of Jiangsu Province (BK20130155)
Keywords: large scale dataset; fast reduced set density estimator; gravity; synchronization clustering
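
Illustrative note: the abstract states that the Davies-Bouldin index is used to pick the best number of clusters on the reduced set. The record does not give the authors' formulation, so the following is only a minimal sketch of the standard index, assuming Euclidean distance and candidate labelings produced elsewhere. The names `davies_bouldin` and `pick_best` are hypothetical and do not come from the paper.

```python
import math


def davies_bouldin(points, labels):
    """Standard Davies-Bouldin index for labelled points (lower is better).

    Assumes Euclidean distance and at least two clusters whose centroids
    do not coincide (coinciding centroids would divide by zero).
    """
    # Group the points by cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    centroids = {}
    scatter = {}
    for label, members in clusters.items():
        dim = len(members[0])
        centroid = tuple(
            sum(m[d] for m in members) / len(members) for d in range(dim)
        )
        centroids[label] = centroid
        # Scatter: mean distance from the cluster's members to its centroid.
        scatter[label] = sum(math.dist(m, centroid) for m in members) / len(members)

    # For each cluster, take its worst (largest) similarity ratio to any
    # other cluster, then average these worst cases over all clusters.
    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


def pick_best(points, candidate_labelings):
    """Return the candidate labeling with the smallest Davies-Bouldin index."""
    return min(candidate_labelings, key=lambda labels: davies_bouldin(points, labels))
```

In a pipeline like the one the abstract describes, each candidate labeling would presumably come from running the clustering step with a different setting on the reduced set; how LSCGS generates those candidates is not specified in this record.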