Abstract
Clustering is an important technique for dividing data patterns into meaningful groups, but the number of groups is difficult to determine. This paper proposes an automatic clustering approach to address that difficulty. To handle an unknown number of clusters, the approach introduces a new particle representation; after particle swarm optimization completes one round of clustering, the k-means algorithm reassigns the data objects and recomputes the cluster centers. The number of clusters is determined with the silhouette coefficient, which combines the cohesion and separation of the clusters, and the sum of the squared errors is used as an auxiliary check. Experiments show that the proposed approach can find the optimum number of clusters and can cluster the data objects effectively.
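The selection step described in the abstract (score each candidate number of clusters by its silhouette coefficient, with the sum of the squared errors as a secondary check) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the particle representation, so the particle swarm stage is omitted here and plain k-means with a deterministic farthest-first initialization stands in for it. The function names (`farthest_first`, `kmeans`, `silhouette`, `sse`, `choose_k`) and the toy data are assumptions made for illustration only.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(points, k):
    """Assumed stand-in initialization: start at points[0], then repeatedly
    add the point farthest from its nearest already-chosen center."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: dist(p, centers[c])) for p in points]

def kmeans(points, k, iters=100):
    """Plain Lloyd iterations: reassign objects, then recompute centers."""
    centers = [tuple(p) for p in farthest_first(points, k)]
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers), centers

def silhouette(points, labels):
    """Mean silhouette coefficient: a = cohesion (mean distance to own cluster),
    b = separation (smallest mean distance to another cluster)."""
    clusters = {}
    for i, l in enumerate(labels):
        clusters.setdefault(l, []).append(i)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # singleton clusters contribute s(i) = 0 by convention
        a = sum(dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(sum(dist(p, points[j]) for j in idx) / len(idx)
                for l, idx in clusters.items() if l != labels[i])
        m = max(a, b)
        if m > 0:
            total += (b - a) / m
    return total / len(points)

def sse(points, labels, centers):
    """Sum of squared distances from each point to its cluster center."""
    return sum(dist(p, centers[l]) ** 2 for p, l in zip(points, labels))

def choose_k(points, k_max):
    """Try k = 2..k_max and return the k with the highest silhouette,
    together with the (silhouette, SSE) pair recorded for every k."""
    results = {}
    for k in range(2, k_max + 1):
        labels, centers = kmeans(points, k)
        results[k] = (silhouette(points, labels), sse(points, labels, centers))
    best_k = max(results, key=lambda k: results[k][0])
    return best_k, results

# Toy data (hypothetical): three well-separated groups of five points each.
square = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
points = [(x + dx, y + dy) for dx, dy in [(0, 0), (10, 10), (20, 0)] for x, y in square]
best_k, results = choose_k(points, 5)
print(best_k, results[best_k])
```

The silhouette coefficient picks the number of clusters, while the sum of the squared errors, which generally shrinks as k grows, is only inspected alongside it rather than maximized or minimized on its own.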
Source
《云南民族大学学报(自然科学版)》
CAS
2016, No. 4, pp. 367-371 (5 pages)
Journal of Yunnan Minzu University: Natural Sciences Edition
Keywords
clustering
cohesion
separation
sum of the squared errors
particle swarm optimization