Abstract
Most clustering algorithms cannot efficiently discover clusters of arbitrary shape and varying density. To address this, this paper proposes an efficient improved clustering algorithm based on a distance-relatedness dynamic model. First, to improve clustering efficiency, a hierarchical clustering algorithm is applied to the data set to obtain initial clusters, and clusters containing too few sample points are removed. Second, to discover clusters of arbitrary shape and varying density, the centroid of each initial cluster is taken as the representative point for all points in that cluster, the distance-relatedness dynamic model is run on these representative points, and the tree structure of the hierarchical clustering is used to prune the computation. Finally, the effectiveness of the algorithm is verified. Experiments on the Chameleon dataset show that the algorithm effectively identifies clusters of arbitrary shape and varying density, and that its time efficiency is significantly better than that of comparable algorithms.
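As an illustration of the first stage only (initial hierarchical clustering, removal of sparsely populated clusters, and selection of centroids as representative points), a minimal Python sketch follows. It is not the authors' implementation: the abstract does not state the linkage criterion, the stopping rule, or the size threshold, so the centroid linkage, the target cluster count `k`, and the `min_size` cutoff are all assumptions made here for illustration. The distance-relatedness dynamic model and the tree-based pruning are not sketched, because the abstract does not define them.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Return the coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def initial_clusters(points, k):
    """Naive agglomerative clustering down to k clusters.

    Assumption: centroid linkage (merge the two clusters whose
    centroids are closest). The source does not name a linkage.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (distance, i, j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def representatives(points, k, min_size):
    """Cluster, drop clusters with fewer than min_size points,
    and return one centroid per remaining cluster.

    k and min_size are hypothetical parameters, not from the paper.
    """
    kept = [c for c in initial_clusters(points, k) if len(c) >= min_size]
    return [centroid(c) for c in kept]


# Two tight groups plus one isolated point that ends up alone.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
reps = representatives(sample, k=3, min_size=2)
```

In this sketch the isolated point forms a single-point cluster and is discarded, so only two representative points are passed on to the later stage. The quadratic pair search is kept deliberately simple; it says nothing about the efficiency the paper reports.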
Source
《计算机科学与探索》
CSCD
PKU Core Journal
2016, Issue 2, pp. 248-256 (9 pages)
Journal of Frontiers of Computer Science and Technology
Funding
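Natural Science Foundation of Jiangsu Province, No. BK20140192
Youth Science and Technology Fund of China University of Mining and Technology, No. 2013QNB16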
Keywords
clustering
arbitrary shaped clusters
different density clusters
distance-relatedness
dynamic model