Approach to selecting initial centers for K-Means with variable threshold

Cited by: 8
Abstract: The K-Means algorithm selects its initial cluster centers at random, which makes clustering performance unstable. To address this limitation, this paper proposes VTK-Means, a method that selects initial cluster centers using a variable threshold. The algorithm selects as initial cluster centers those samples whose distance to the existing initial points is greater than a threshold, and then adjusts the threshold according to the number of initial centers that satisfy this condition. Experimental results on 10 UCI machine learning data sets indicate that the algorithm clearly outperforms the standard K-Means algorithm and yields more stable results.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2011, No. 32, pp. 56-58 (3 pages)
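Funding: Shandong Province Science and Technology Research Program (No.2007ZZ17, No.2008GG10001015, No.2008B0026); Shandong Provincial Department of Education Research Project (No.J09LG02)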
Keywords: K-Means; clustering; variable threshold; initial cluster center
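
The selection rule described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the threshold is initialized or adjusted, so everything below is an assumption made for illustration: the function name `select_initial_centers`, the choice of the first sample as the first center, Euclidean distance, and the rule of shrinking the threshold by a fixed factor (`shrink`) whenever fewer than `k` centers qualify. It is not the authors' exact procedure.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def select_initial_centers(points: List[Point], k: int, threshold: float,
                           shrink: float = 0.9,
                           max_rounds: int = 100) -> List[Point]:
    """Pick k initial centers that lie farther than `threshold` from each other.

    Assumptions (not taken from the paper): the first sample seeds the set,
    distances are Euclidean, and the threshold is multiplied by `shrink`
    each time a full pass finds fewer than k qualifying samples.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    if threshold <= 0 or not 0 < shrink < 1:
        raise ValueError("threshold must be positive and 0 < shrink < 1")

    for _ in range(max_rounds):
        centers = [points[0]]
        for p in points[1:]:
            if len(centers) == k:
                break
            # Accept p only if it is far from every center chosen so far.
            if min(math.dist(p, c) for c in centers) > threshold:
                centers.append(p)
        if len(centers) == k:
            return centers
        # Too few samples qualified: relax the threshold and scan again.
        threshold *= shrink

    raise ValueError("could not find k sufficiently separated samples")


if __name__ == "__main__":
    demo = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0),
            (5.1, 5.0), (10.0, 0.0), (10.0, 0.2)]
    print(select_initial_centers(demo, k=3, threshold=20.0))
```

The returned samples would then be passed to an ordinary K-Means run as its starting centers. Starting from a deliberately large threshold and relaxing it is only one possible adjustment rule; it keeps the chosen centers as far apart as the data allow.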