Abstract
To address the shortcomings of the distributed clustering algorithm DBDC, a distributed clustering algorithm based on center points and density, called DCUCD, is proposed. Virtual points computed from the data distribution serve as core objects, and the core objects become more representative the more times the algorithm is executed. Clustering then amounts to classifying all of the core objects. Theoretical analysis and experimental results show that DCUCD effectively handles local noise and irregularly distributed data points, discovers clusters of arbitrary shape, and achieves good time efficiency and clustering quality.
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals
2010, Issue 19, pp. 56-58 (3 pages)
Computer Engineering
Funding
National Natural Science Foundation of China (Grant No. 50604012)
Keywords
data mining
distributed clustering
centers
noise