Abstract
A particle swarm optimization algorithm (autoPSO) is proposed for automatically clustering high-dimensional data without presetting the number of clusters. autoPSO optimizes the Davies-Bouldin (DB) validity index, which turns the clustering task into a continuous function optimization problem with bound constraints. Each particle is represented by a real-valued matrix and a binary vector, so that partitions with different numbers of clusters can be represented within the same iteration. A new crossover learning operation on the real-valued matrix, governed by its associated binary vector, maintains population diversity and thereby guards against premature convergence. Experimental results on synthetic high-dimensional data sets from a data generator show that the proposed algorithm correctly identifies clusters in high-dimensional data without the number of clusters k being preset.
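The particle encoding and DB objective described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `db_fitness`, the nearest-center assignment rule, the use of recomputed centroids, and the choice to return infinity for partitions with fewer than two non-empty clusters are all assumptions made for illustration. The crossover learning step and the PSO update rules are not shown.

```python
import numpy as np

def db_fitness(X, centers, mask):
    """Davies-Bouldin index of the partition encoded by one particle.

    X       : (n, d) data matrix
    centers : (k_max, d) real-valued matrix of candidate cluster centers
    mask    : (k_max,) binary vector; 1 means the matching row is active
    All names and conventions here are hypothetical, not from the paper.
    """
    active = centers[np.asarray(mask, dtype=bool)]
    if len(active) < 2:
        return np.inf  # assumption: DB is undefined below two clusters

    # Assign every point to its nearest active center.
    dist = np.linalg.norm(X[:, None, :] - active[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    used = np.unique(labels)  # ignore active centers that attract no points
    if len(used) < 2:
        return np.inf

    # Centroid and mean within-cluster distance (scatter) of each cluster.
    cents = np.array([X[labels == k].mean(axis=0) for k in used])
    scatter = np.array([
        np.linalg.norm(X[labels == k] - c, axis=1).mean()
        for k, c in zip(used, cents)
    ])

    # Pairwise centroid separation; the diagonal is excluded from the max.
    sep = np.linalg.norm(cents[:, None, :] - cents[None, :, :], axis=2)
    np.fill_diagonal(sep, np.inf)

    # DB = average over clusters of the worst (S_i + S_j) / M_ij ratio.
    ratio = (scatter[:, None] + scatter[None, :]) / sep
    return ratio.max(axis=1).mean()
```

Because the binary vector switches rows of the matrix on and off, particles of identical shape can encode different cluster counts, and a lower return value indicates a more compact, better separated partition.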
Source
《福建工程学院学报》
CAS
2011, No. 6, pp. 607-612 (6 pages)
Journal of Fujian University of Technology
Keywords
automatic determination of the number of clusters
particle swarm optimizer
Davies-Bouldin (DB) index