Abstract
Determining the number of clusters has long been a difficult problem in cluster analysis. This paper proposes a novel dynamic genetic clustering algorithm (DGCA) to address it. The DGCA adopts a maximum attribute range partition method to overcome the sensitivity of partitional clustering algorithms to the initial values of the cluster centers. It also uses two-stage dynamic selection and mutation operations, in which the selection probability and mutation rate vary with the consistency of the number of clusters in the population: a parallel search over different numbers of clusters is carried out first, followed by a search for the optimal cluster centers. Experiments on seven data sets show that the proposed DGCA can automatically perform a global search for the best partition of a data set, finding both the optimal number of clusters and the optimal cluster centers.
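The abstract names a maximum attribute range partition method but does not define it. The sketch below shows one plausible reading, offered only as an illustration and not as the authors' procedure: pick the attribute with the widest value range, split that range into k equal-width intervals, and use the mean of the points in each interval as an initial cluster center. The function name `max_range_init`, the equal-width split, and the handling of empty intervals are all assumptions.

```python
import numpy as np

def max_range_init(X, k):
    """Hypothetical deterministic initializer (an assumption, not the paper's definition).

    Splits the attribute with the largest value range into k equal-width
    intervals and returns the mean of the points in each non-empty interval.
    """
    X = np.asarray(X, dtype=float)
    ranges = X.max(axis=0) - X.min(axis=0)   # value range of every attribute
    j = int(np.argmax(ranges))               # attribute with the widest range
    lo, hi = X[:, j].min(), X[:, j].max()
    if hi == lo:
        raise ValueError("all attributes are constant; nothing to partition")
    # Map each point to an interval index in 0..k-1 (the maximum goes to k-1).
    bins = np.minimum(((X[:, j] - lo) / (hi - lo) * k).astype(int), k - 1)
    centers = []
    for b in range(k):
        members = X[bins == b]
        if len(members) > 0:                 # empty intervals yield no center
            centers.append(members.mean(axis=0))
    return np.array(centers)

# Small usage example with two well-separated groups.
data = [[0.0, 0.0], [1.0, 0.1], [9.0, 0.2], [10.0, 0.3]]
initial_centers = max_range_init(data, k=2)
```

Because the result depends only on the data and k, repeated runs start from the same centers, which is the property the abstract attributes to this kind of initialization. Under this reading, fewer than k centers are returned when some intervals contain no points.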
Source
《电子学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2012, No. 2, pp. 254-259 (6 pages)
Acta Electronica Sinica
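Funding
National Natural Science Foundation of China (No. 60971004, No. 61171088)
Natural Science Foundation of Shanghai (No. 10ZR1422400)
Key Scientific Research Innovation Project of the Shanghai Municipal Education Commission (No. 09ZZ141)
Shanghai Normal University Key Discipline Project (No. DZL811)
Shanghai Normal University Original and Forward-Looking Pre-Research Project (No. DYL201006)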
Keywords
cluster analysis
genetic algorithm
dynamic selection
mutation