摘要
针对密度峰值聚类算法DPC(clustering by fast search and find of density peaks)时间复杂度高、准确度低的缺陷,提出了一种基于Ball-Tree优化的快速密度峰值聚类算法BT-DPC。算法利用第k近邻度量样本局部密度,通过构建Ball-Tree加速密度ρ及距离δ的计算;在类簇分配阶段,结合k近邻思想设计统计学习分配策略,将边界点正确归类。通过在UCI数据集上的实验,将该算法与原密度峰值聚类算法及其改进算法进行了对比,实验结果表明,BT-DPC算法在降低时间复杂度的同时提高了聚类的准确度。
In order to overcome the deficiencies of clustering by fast search and find of density peaks(DPC)for its high time complexity and low accuracy,an optimized fast density peak clustering algorithm is proposed based on Ball-Tree in this paper(BT-DPC).The algorithm defines local density of a point based on k-nearest neighbor,and constructs a ball tree to accelerate the calculation of the local densityρand the distanceδ.In the cluster allocation stage,the statistical learning allocation strategy is designed based on the k-nearest neighbors idea to classify the boundary points correctly.The experimental result shows that the BT-DPC algorithm can improve the time performance on the basis of increasing clustering quality compared with DPC algorithm and other popular clustering algorithms through the theory analysis and the experiments on several real-world datasets from the UCI machine learning repository.
作者
丁松阳
田青云
DING Songyang;TIAN Qingyun(School of Computer and Information Engineering,Henan University of Economics and Law,Zhengzhou 450046,China)
出处
《计算机工程与应用》
CSCD
北大核心
2021年第20期90-96,共7页
Computer Engineering and Applications
基金
国家自然科学基金(61202285)。
关键词
聚类算法
ball-tree
密度峰值聚类
分配策略
clustering algorithm
ball-tree
clustering by fast search and find of density peaks(DPC)
allocation strategy