
Density peaks clustering method based on density dichotomy (cited by: 4)
Abstract Density peaks clustering (DPC) can cluster data quickly regardless of the shape of the clusters or the dimensionality of the space that contains them, and it has been widely studied and applied in recent years. However, its clustering quality degrades when the densities of the cluster centers differ greatly, or when a single cluster contains several density peaks. To address this, a density peaks clustering method based on density dichotomy is proposed. First, the average density over all data points is computed, and the data are divided into high-density points and low-density points. Second, cluster centers are identified from the decision diagram of the high-density points, and centers that belong to the same cluster are merged according to whether data points within reachable distance exist. Finally, both the high-density points and the low-density points are assigned to appropriate cluster centers according to the proposed assignment strategy, which completes the clustering. Experiments on several synthetic and real datasets show that the proposed method clearly outperforms existing DPC methods.
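The first two steps in the abstract (computing densities, then splitting the data at the average density) can be illustrated with a minimal sketch. Several details here are assumptions, not taken from the paper: the Gaussian-kernel density and the cutoff distance `d_c` follow the standard DPC formulation, and the function names `local_density`, `density_dichotomy`, and `delta_among` are hypothetical. The merging of centers and the assignment strategy are not sketched, because the abstract does not specify them.

```python
import math

def local_density(points, d_c):
    """Gaussian-kernel local density of each point.

    Assumption: this is the kernel commonly used in standard DPC; the
    abstract does not say which density estimate the paper uses.
    """
    rho = []
    for i, p in enumerate(points):
        total = 0.0
        for j, q in enumerate(points):
            if i != j:
                total += math.exp(-(math.dist(p, q) / d_c) ** 2)
        rho.append(total)
    return rho

def density_dichotomy(rho):
    """Split point indices into high- and low-density groups at the mean density."""
    mean_rho = sum(rho) / len(rho)
    high = [i for i, r in enumerate(rho) if r >= mean_rho]
    low = [i for i, r in enumerate(rho) if r < mean_rho]
    return high, low

def delta_among(points, rho, subset):
    """Distance from each point in `subset` to its nearest denser point in `subset`.

    A point with no denser neighbour (the density peak) gets its largest
    distance to any other point in the subset, as in standard DPC.
    """
    delta = {}
    for i in subset:
        denser = [math.dist(points[i], points[j]) for j in subset if rho[j] > rho[i]]
        if denser:
            delta[i] = min(denser)
        else:
            delta[i] = max(
                (math.dist(points[i], points[j]) for j in subset if j != i),
                default=0.0,
            )
    return delta

# Toy data: one tight group near the origin plus three sparse points.
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (5, 5), (5.1, 5), (9, 9)]
rho = local_density(points, d_c=0.5)
high, low = density_dichotomy(rho)
delta = delta_among(points, rho, high)
```

In a decision diagram built only from the high-density points, candidate cluster centers would be the points where both `rho` and `delta` are large; restricting the diagram to that subset is what keeps sparse points from distorting center selection.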
Authors XU Chaoyang; LIN Yaohai; ZHANG Ping (School of Information Engineering, Putian University, Putian, Fujian 351100, China; College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China)
Source Computer Engineering and Applications (CSCD; Peking University Core Journal), 2018, No. 12, pp. 138-145 (8 pages)
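Funding Putian Science and Technology Bureau Project (No.2015G2011); Natural Science Foundation of Fujian Province (No.2014J01073); National Natural Science Foundation of China, Young Scientists Fund (No.31300473)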
Keywords Density Peaks Clustering (DPC); density dichotomy; decision diagram; high density points
