Abstract
Traditional clustering algorithms are sensitive to the choice of initial cluster centers, easily fall into local optima, and require the number of clusters to be specified in advance. This paper proposes a GEP (Gene Expression Programming) automatic clustering algorithm with dynamic penalty factors. The algorithm combines penalty factors with GEP clustering, requires no prior knowledge of the data set, and partitions the data into clusters automatically. A dynamic generation algorithm for the penalty factors is further proposed; it produces penalty factors that fit the distribution characteristics of each data set and thereby better handles the influence of isolated points and noise points. Four self-constructed data sets were used to test the effect of the penalty factor on clustering. Based on the test results, a model of the penalty factor was built and then applied to the standard data set Iris. The experimental results show that the algorithm achieves high efficiency and accuracy.
Source
《系统仿真学报》
CAS
CSCD
PKU Core (北大核心)
2016, No. 4, pp. 806-814 (9 pages)
Journal of System Simulation
Funding
National Spark Program of China, 2014 (2014GA780012)
Natural Science Foundation of Guangdong Province (2014A030313454)