Abstract
Clustering is an important tool for mining and analyzing massive data in the big data era. Building on the density peak clustering algorithm, this paper proposes a clustering model for high-dimensional data that clusters data of more than six dimensions into arbitrarily shaped clusters in a simple and direct way. The model automates the preprocessing step, takes as cluster centers the points that have high local density and lie relatively far from other points of high local density, and finally introduces parameter tuning. Experimental results show that the model is not only practical for low-dimensional data but also performs well on high-dimensional data.
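The center-selection rule described in the abstract can be illustrated with a minimal sketch of the generic density peak idea. This is not the paper's model: the Gaussian kernel, the cutoff distance `dc`, the ranking by the product of density and distance, and the function names `density_peak_scores` and `pick_centers` are all assumptions made for illustration, and the automatic preprocessing and parameter tuning mentioned in the abstract are not reproduced here.

```python
import numpy as np

def density_peak_scores(X, dc):
    """Return (rho, delta) for each row of X.

    rho   : local density, here a Gaussian kernel with width dc (assumed).
    delta : distance to the nearest point of higher density; for the
            densest point, the largest distance to any other point.
    """
    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Subtract 1 to drop each point's contribution to its own density.
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0

    n = len(X)
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        if higher.any():
            delta[i] = dist[i, higher].min()
        else:
            delta[i] = dist[i].max()
    return rho, delta

def pick_centers(rho, delta, k):
    """Pick k candidate centers by ranking rho * delta (an assumed heuristic)."""
    return np.argsort(rho * delta)[::-1][:k]

# Two well-separated synthetic groups in 8 dimensions (illustrative data only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(40, 8)),
    rng.normal(10.0, 0.5, size=(40, 8)),
])
rho, delta = density_peak_scores(X, dc=2.0)
centers = pick_centers(rho, delta, k=2)
print(centers)
```

Points that score high on both quantities are the candidates for cluster centers; in the usual density peak procedure each remaining point is then assigned to the cluster of its nearest higher-density neighbor. The pairwise distance matrix costs memory quadratic in the number of points, so this sketch only suits small datasets.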
Source
Journal of Communication University of China: Science and Technology (《中国传媒大学学报(自然科学版)》)
2016, No. 5, pp. 29-32, 36 (5 pages)
Keywords
high-dimensional data
density peak
cluster center
data mining