Abstract
Density peaks clustering is a recent density-based clustering algorithm that does not require the number of clusters to be specified in advance and can find non-spherical clusters. To address the inability of Euclidean-distance-based density peaks clustering to handle data sets with complex structure, we propose a density peaks clustering algorithm based on density-adaptive distance. First, the density-adaptive distance, which comprises a local density-adaptive distance and a global density-adaptive distance, is computed from the Euclidean distance and an adaptive similarity so as to better describe the spatial distribution of the data. Second, the density-adaptive distance is applied within density peaks clustering to obtain the new algorithm. Experiments on artificial data sets and real UCI data sets show that the new algorithm not only handles complex-structured data sets effectively but also achieves higher accuracy.
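As a rough illustration of where a modified distance enters the procedure, the following is a minimal sketch of baseline density peaks clustering on a precomputed distance matrix. It is not the authors' implementation: the abstract does not give the formula for the density-adaptive distance, so the sketch takes any symmetric distance matrix `dist` as input (plain Euclidean distances in the usage lines). The function name `density_peaks`, the cutoff `d_c`, the Gaussian density kernel, and the parameter `n_centers` are all assumptions made for illustration. In particular, the points with the largest product of density and separation are taken as centers only to keep the example runnable; density peaks clustering normally reads the centers off a decision graph.

```python
import numpy as np


def density_peaks(dist, d_c, n_centers):
    """Baseline density peaks clustering on a symmetric distance matrix.

    dist      : (n, n) array of pairwise distances (assumed symmetric)
    d_c       : cutoff distance used as the kernel width (assumed parameter)
    n_centers : number of centers to keep (assumption; see lead-in)
    """
    n = dist.shape[0]
    # Local density with a Gaussian kernel; subtract the self term exp(0) = 1.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    # Point indices sorted by decreasing density.
    order = np.argsort(-rho, kind="stable")

    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]  # points with density at least rho[i]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]  # distance to nearest denser point
        nearest_higher[i] = j
    # Convention for the densest point: its largest distance to any point.
    delta[order[0]] = dist[order[0]].max()

    # Rank candidates by gamma = rho * delta; ties favor the denser point.
    gamma = rho * delta
    centers = order[np.argsort(-gamma[order], kind="stable")[:n_centers]]

    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    # Visit points from dense to sparse so the denser neighbor is labeled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, rho, delta


# Usage with plain Euclidean distances on two synthetic blobs.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0.0, 0.5, size=(30, 2)),
    rng.normal(10.0, 0.5, size=(30, 2)),
])
euclid = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
labels, rho, delta = density_peaks(euclid, d_c=1.0, n_centers=2)
```

Because the routine only consumes `dist`, a density-adaptive distance matrix could in principle be passed in place of `euclid` without changing the rest of the procedure.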
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
Peking University Core Journals (北大核心)
2017, Issue 6, pp. 1347-1352 (6 pages)
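Funding
Supported by the National Natural Science Foundation of China (61402203)
Supported by the Jiangsu Province Graduate Research and Innovation Program for Regular Universities (KYLX15_1169)
Supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions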
Keywords
clustering
density peaks clustering
adaptive similarity
density adaptive distance