Research on Optimizing the Number of Clusters in the K-Means Algorithm (cited by: 17)

Optimization Study on Class Number of K-means Algorithm
Abstract: In the traditional K-means clustering algorithm, the number of clusters K must be specified in advance. In practice, however, K is difficult to determine precisely, and whether K is chosen reasonably directly affects the quality of the K-means result. To address this shortcoming, an algorithm for optimizing the number of clusters is proposed. Following the basic clustering principle that within-class similarity should be maximal and within-class dissimilarity minimal, while between-class dissimilarity should be maximal and between-class similarity minimal, a distance evaluation function F(S,K) is constructed as the test function for the optimal number of clusters, and a corresponding mathematical model is established. Simulation experiments further verify the effectiveness of the new algorithm.
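The idea of choosing K with a distance-based test function can be illustrated with a minimal sketch. The abstract does not give the form of F(S,K), so the version below is an assumption rather than the paper's definition: it adds a between-class term (distance from each cluster center to the global mean) to a within-class term (distance from each point to its cluster center) and picks the K that minimizes the sum. The helper names `kmeans`, `distance_cost`, and `best_k`, as well as the deterministic farthest-first initialization, are hypothetical choices made only for this illustration.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iters=100):
    """Plain Lloyd iterations with a deterministic farthest-first start.

    The initialization is an assumption of this sketch, chosen so that
    repeated runs give the same result.
    """
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [
            mean(members) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


def distance_cost(points, centers, clusters):
    """Assumed form of F(S, K): between-class plus within-class distance."""
    global_mean = mean(points)
    between = sum(math.dist(c, global_mean) for c in centers)
    within = sum(
        math.dist(p, c)
        for c, members in zip(centers, clusters)
        for p in members
    )
    return between + within


def best_k(points, k_max):
    """Evaluate the cost for K = 1..k_max and return the minimizing K."""
    costs = {}
    for k in range(1, k_max + 1):
        centers, clusters = kmeans(points, k)
        costs[k] = distance_cost(points, centers, clusters)
    return min(costs, key=costs.get), costs


if __name__ == "__main__":
    # Toy data: three well-separated 2x2 squares of points.
    sample = [
        (0, 0), (0, 1), (1, 0), (1, 1),
        (10, 0), (10, 1), (11, 0), (11, 1),
        (0, 10), (0, 11), (1, 10), (1, 11),
    ]
    k, costs = best_k(sample, 6)
    print("selected K:", k)
    for candidate, cost in sorted(costs.items()):
        print(candidate, round(cost, 3))
```

On this toy data the within-class term falls as K grows while the between-class term rises, so the sum has a minimum at an intermediate K (here K = 3). Whether the same trade-off recovers the right K on real data depends on the actual definition of F(S,K) in the paper.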
Author: Han Lingbo (韩凌波)
Source: Journal of Sichuan University of Science & Engineering (Natural Science Edition) (《四川理工学院学报(自然科学版)》), CAS, 2012, No. 2, pp. 77-80 (4 pages)
Funding: Guangxi Science Foundation Project (0640067); Guangxi Graduate Education Innovation Program Project (2007106020812M73)
Keywords: K-means algorithm; number of clusters; distance cost function