Abstract
In the traditional K-means clustering algorithm, the number of clusters K must be specified in advance. In practice, however, K is difficult to determine precisely, and whether K is chosen well directly affects the quality of the K-means result. To address this shortcoming, an algorithm for optimizing the number of clusters is proposed. Following the basic principle of clustering that within-class similarity should be maximal and within-class difference minimal, while between-class difference should be maximal and between-class similarity minimal, a distance evaluation function F(S,K) is constructed as the test function for the optimal number of clusters, and a corresponding mathematical model is established. Simulation experiments further verify the effectiveness of the new algorithm.
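The abstract does not state the definition of F(S,K), so the following Python sketch only illustrates the general idea of scanning candidate values of K and keeping the one with the best distance-based score. Everything in it is an assumption made for illustration: the function names (`kmeans`, `distance_score`, `choose_k`), the deterministic farthest-point seeding, and in particular the stand-in score, which adds the within-class distances (points to their cluster centre) to the between-class distances (cluster centres to the overall mean) and treats the smallest total as best. It should not be read as the paper's actual model.

```python
import math


def mean(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def init_centers(points, k):
    """Deterministic farthest-point seeding (an assumption, not from the paper)."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point whose nearest existing centre is farthest away.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations; returns (centers, labels)."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = mean(members)
    return centers, labels


def distance_score(points, centers, labels):
    """Stand-in for F(S,K): within-class plus between-class distance (assumed)."""
    overall = mean(points)
    within = sum(math.dist(p, centers[lab]) for p, lab in zip(points, labels))
    between = sum(math.dist(c, overall) for c in centers)
    return within + between


def choose_k(points, k_values):
    """Run K-means for each candidate K and return (best_k, scores)."""
    scores = {}
    for k in k_values:
        centers, labels = kmeans(points, k)
        scores[k] = distance_score(points, centers, labels)
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Three well-separated groups of four points each (made-up toy data).
    data = [
        (0, 0), (0, 1), (1, 0), (1, 1),
        (10, 0), (10, 1), (11, 0), (11, 1),
        (5, 10), (5, 11), (6, 10), (6, 11),
    ]
    best_k, all_scores = choose_k(data, range(1, 5))
    print(best_k, all_scores)
```

The within-class term shrinks as K grows while the between-class term grows with the number of centres, so their sum can have an interior minimum. That is why a score of this shape can select K without a preset value, under the assumptions stated above.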
Source
Journal of Sichuan University of Science & Engineering (Natural Science Edition) (《四川理工学院学报(自然科学版)》)
CAS
2012, No. 2, pp. 77-80 (4 pages)
Funding
Guangxi Science Foundation Project (0640067)
Innovation Project of Guangxi Graduate Education (2007106020812M73)