
Self-adaptive K-means algorithm based on determination of similarity between clusters (cited by: 1)
Abstract: The traditional K-means clustering algorithm has two drawbacks: the number of clusters must be fixed in advance, and the result is sensitive to the choice of initial centroids, so the algorithm easily converges to a local optimum. To address both, a similarity measure between clusters is defined and used to improve traditional K-means. Without knowing the value of K in advance, the new algorithm selects initial centroids according to Euclidean distance and clusters the data with K-means, then filters out noise samples and determines the radius of each cluster, and finally computes the similarity between clusters and merges similar clusters. This determines the number of classes in the data set and yields a better clustering result. Experiments on UCI data sets show that the new algorithm accurately determines the number of classes and achieves higher clustering accuracy than the traditional K-means algorithm.
Authors: 陈杰 (Chen Jie), 朱娟 (Zhu Juan)
Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal, 2010, No. 10, pp. 2270-2272, 2375 (4 pages)
Keywords: clustering; K-means; basic cluster; similarity between clusters; cluster merger
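The pipeline listed in the abstract (pick initial centroids by Euclidean distance, run K-means, filter noise and determine cluster radii, then merge similar clusters) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not give the centroid-selection rule, the noise criterion, or the similarity formula, so farthest-first selection, a mean-plus-two-standard-deviations cutoff, the ratio `(r_i + r_j) / centroid distance`, and the `threshold=0.5` default are all hypothetical stand-ins, as are every function and parameter name.

```python
import numpy as np

def farthest_first_centroids(X, k):
    """ASSUMPTION: farthest-first selection. Start at X[0], then repeatedly take
    the sample with the largest Euclidean distance to the centroids chosen so far."""
    chosen = [0]
    dist = np.linalg.norm(X - X[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen].astype(float)

def assign(X, centroids):
    """Index of the nearest centroid for every sample."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans(X, centroids, n_iter=100):
    """Plain Lloyd iterations starting from the given centroids."""
    for _ in range(n_iter):
        labels = assign(X, centroids)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(len(centroids))])
        if np.allclose(new, centroids):
            break
        centroids = new
    return assign(X, centroids), centroids

def cluster_radii(X, labels, centroids, n_std=2.0):
    """ASSUMPTION: members farther than mean + n_std * std from their centroid
    count as noise; the radius is the largest remaining distance."""
    radii = np.zeros(len(centroids))
    for j, c in enumerate(centroids):
        d = np.linalg.norm(X[labels == j] - c, axis=1)
        if d.size:
            kept = d[d <= d.mean() + n_std * d.std()]
            radii[j] = kept.max() if kept.size else d.max()
    return radii

def merge_similar(centroids, radii, threshold=0.5):
    """ASSUMPTION: similarity = (r_i + r_j) / distance between centroids.
    Clusters whose similarity reaches the threshold are merged transitively."""
    k = len(centroids)
    parent = list(range(k))

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    for i in range(k):
        for j in range(i + 1, k):
            gap = np.linalg.norm(centroids[i] - centroids[j])
            sim = np.inf if gap == 0 else (radii[i] + radii[j]) / gap
            if sim >= threshold:
                parent[find(j)] = find(i)
    roots = np.array([find(i) for i in range(k)])
    # Map each original cluster to a consecutive merged-cluster id.
    return np.unique(roots, return_inverse=True)[1]

def adaptive_kmeans(X, k_init, threshold=0.5):
    """Over-cluster with k_init basic clusters, then merge similar ones."""
    labels, centroids = kmeans(X, farthest_first_centroids(X, k_init))
    radii = cluster_radii(X, labels, centroids)
    group = merge_similar(centroids, radii, threshold)
    return group[labels], int(group.max()) + 1

# Toy data: two well-separated runs of points on a line, deliberately over-clustered.
X = np.array([[float(x), 0.0] for x in list(range(6)) + list(range(100, 106))])
labels, n_clusters = adaptive_kmeans(X, k_init=4)
print(n_clusters, labels)
```

On this toy input the four basic clusters collapse into two, one per run of points. Starting with more clusters than expected and merging afterwards is what lets the final count come from the data rather than from a preset K; the quality of the outcome depends on the merge threshold, which here is an arbitrary illustrative value.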
