
Adaptive clustering algorithm based on maximum and minimum distances and SSE (cited by: 45)
Abstract: K-means clustering is a widely used clustering algorithm that requires the initial centers and the number of clusters to be specified in advance. However, arbitrarily chosen initial centers can cause the clustering to converge to a local optimum, and the number of clusters is not necessarily known in practice. To address these shortcomings, this paper proposes an adaptive clustering algorithm. The algorithm selects the initial cluster centers based on the maximum and minimum distances between data instances, chooses the relatively sparsest cluster to split based on the sum of squared errors (SSE), and stops splitting according to the trend of the SSE, thereby determining the number of clusters automatically. Experimental results show that the algorithm produces more accurate clustering results without increasing the number of iterations, which verifies its effectiveness.
Source: Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition), Peking University Core Journal, 2015, No. 2, pp. 102-107 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (61170322, 71171117, 61373065)
Keywords: K-means clustering algorithm; maximum and minimum distances; initial centers; sum of squared errors
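The two building blocks named in the abstract, max-min distance seeding and SSE-based choice of the cluster to split, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how the first seed is chosen or how "relatively sparsest" is measured, so the sketch assumes the first data point as the initial seed and treats the cluster with the largest SSE as the one to split. The function names and the sample data are hypothetical, and the SSE-trend stopping rule is not covered.

```python
import math


def maxmin_centers(points, k):
    """Pick k initial centers with the max-min distance rule.

    Assumption: the first point is used as the initial seed.
    """
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point whose distance to its nearest
        # already-chosen center is the largest.
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(nxt)
    return centers


def assign(points, centers):
    """Assign every point to its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[idx].append(p)
    return clusters


def sse(cluster):
    """Sum of squared distances from each point to the cluster centroid."""
    if not cluster:
        return 0.0
    dim = len(cluster[0])
    centroid = [sum(p[d] for p in cluster) / len(cluster) for d in range(dim)]
    return sum(math.dist(p, centroid) ** 2 for p in cluster)


def cluster_to_split(clusters):
    """Index of the cluster with the largest SSE (assumed sparsity measure)."""
    return max(range(len(clusters)), key=lambda i: sse(clusters[i]))


# Hypothetical toy data: two tight groups and one loose pair.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0), (26, 0)]
centers = maxmin_centers(points, 3)
clusters = assign(points, centers)
split_index = cluster_to_split(clusters)
```

On this toy data the loose pair `(20, 0)`, `(26, 0)` has the largest SSE, so it is the candidate for splitting. A full algorithm would rerun K-means after each split and stop once the SSE no longer drops appreciably.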

