An Improved K-Means Clustering Algorithm
Abstract: Traditional K-Means places high demands on its user, who must specify the value of K and choose the positions of the initial cluster centers. This paper improves the traditional K-Means algorithm in three ways: outliers are defined, detected, and removed; the Canopy algorithm is used to help determine the range of K values and a set of rough center points; and the Silhouette evaluation index is used to select the optimal K value and its corresponding clustering result. The improved algorithm does not require K or the initial centers to be entered manually. Verification results show that the improved K-Means algorithm produces stable and accurate clusterings and that, when the number of data points is large, it needs slightly fewer iterations than the traditional algorithm.
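Author: XU Li (School of Software, Shangqiu Polytechnic, Shangqiu, Henan 476100, China)
Source: Journal of Hebei Software Institute, 2018, No. 2, pp. 18-20 (3 pages)
Funding: Research project of the Henan Federation of Social Science Circles and the Henan Federation of Economic Organizations (SKL-2016-2062)
Keywords: K-Means clustering algorithm; outlier; simulation experiment (given as "similarity calculation" in the English keyword list); Silhouette index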
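
The pipeline summarized in the abstract (remove outliers, use Canopy to bound K and obtain rough centers, then pick K by Silhouette score) can be sketched roughly as follows. This is only an illustrative sketch, not the paper's implementation: the abstract does not give the outlier definition, the Canopy thresholds, or the rule that maps canopies to a K range, so the mean-distance outlier rule, the single threshold `t2`, the choice to try K from 2 up to the number of canopies, and all function names below are assumptions made for illustration. It requires Python 3.8+ for `math.dist`.

```python
import math

def remove_outliers(points, factor=2.0):
    # Assumed rule (not from the paper): drop a point whose mean distance to
    # the other points exceeds `factor` times the average of those means.
    n = len(points)
    mean_d = [sum(math.dist(p, q) for q in points) / (n - 1) for p in points]
    cutoff = factor * sum(mean_d) / n
    return [p for p, d in zip(points, mean_d) if d <= cutoff]

def canopy_centers(points, t2):
    # Simplified Canopy pass: take the first remaining point as a center and
    # discard every point within distance t2 of it, until none remain.
    remaining, centers = list(points), []
    while remaining:
        c = remaining[0]
        centers.append(c)
        remaining = [p for p in remaining if math.dist(p, c) > t2]
    return centers

def kmeans(points, centers, max_iter=100):
    # Standard Lloyd iterations starting from the supplied centers.
    centers = [tuple(c) for c in centers]
    labels, it = [], 0
    for it in range(1, max_iter + 1):
        labels = [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j, c in enumerate(centers):
            members = [p for p, l in zip(points, labels) if l == j]
            new_centers.append(
                tuple(sum(x) / len(members) for x in zip(*members)) if members else c)
        if new_centers == centers:  # assignments can no longer change
            break
        centers = new_centers
    return labels, centers, it

def silhouette(points, labels):
    # Mean silhouette coefficient; singleton clusters contribute 0.
    ids = sorted(set(labels))
    if len(ids) < 2:
        return -1.0
    total = 0.0
    for i, p in enumerate(points):
        by_cluster = {j: [] for j in ids}
        for k, q in enumerate(points):
            if k != i:
                by_cluster[labels[k]].append(math.dist(p, q))
        own = by_cluster[labels[i]]
        if not own:
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for j, d in by_cluster.items() if j != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def improved_kmeans(points, t2):
    # Assumption: the canopy count is the upper bound on K, and the first K
    # canopy centers serve as the initial centers for each candidate K.
    clean = remove_outliers(points)
    rough = canopy_centers(clean, t2)
    best = None  # stays None if fewer than two canopies are found
    for k in range(2, len(rough) + 1):
        labels, centers, _ = kmeans(clean, rough[:k])
        score = silhouette(clean, labels)
        if best is None or score > best[0]:
            best = (score, k, labels, centers)
    return best

# Hypothetical demo data: three tight groups plus one distant point.
offsets = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
demo_points = [(bx + dx, by + dy)
               for bx, by in [(0, 0), (10, 0), (0, 10)] for dx, dy in offsets]
demo_points.append((100, 100))
demo_result = improved_kmeans(demo_points, t2=3.0)
```

In this sketch neither K nor the initial centers are supplied by the caller; only the Canopy threshold `t2` is, and how the paper sets its thresholds is not stated in the abstract. `demo_result` holds the tuple `(score, k, labels, centers)` for the candidate K with the highest Silhouette score.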

