An Improvement of K-means Algorithm Based on Information Entropy and Density
Abstract: Five main factors affect the K-means clustering algorithm: the number of clusters, the initial cluster centers, outliers, the similarity measure, and the cluster evaluation criterion. This paper uses information entropy to determine the weight of each attribute and applies these weights to the Euclidean distance. Isolated points are removed from the data set so that better cluster centers can be selected, and the data set is then clustered using the weighted Euclidean distance formula. Experimental results show that the K-means clustering algorithm based on information entropy and density produces more accurate clustering results.
Author: GU Yu-rong (谷玉荣), North Automatic Control Technology Institute, Taiyuan, Shanxi 030006
Source: Digital Technology & Application (《数字技术与应用》), 2018, No. 12, pp. 107-109, 112 (4 pages)
Keywords: information entropy; weighted Euclidean distance; K-means clustering algorithm based on information entropy and density
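
The abstract describes the method only at a high level, so the following Python sketch is one plausible reading rather than the paper's implementation. It assumes the standard entropy weight method for the attribute weights, a simple neighbour-count density test for isolated points, and a farthest-point heuristic for the initial centers. All function names and the parameters `radius` and `min_neighbors` are hypothetical and do not come from the paper.

```python
import math


def entropy_weights(data):
    """Assumed entropy weight method: one weight per attribute, summing to 1.

    Expects at least two rows. A constant attribute gets weight 0.
    """
    n, m = len(data), len(data[0])
    diversities = []
    for j in range(m):
        col = [row[j] for row in data]
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0
        # Min-max normalise, then treat the column as a probability distribution.
        norm = [(v - lo) / span for v in col]
        total = sum(norm)
        if total == 0:  # constant column carries no information
            diversities.append(0.0)
            continue
        probs = [v / total for v in norm]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        diversities.append(1.0 - entropy)
    total_div = sum(diversities)
    if total_div == 0:
        return [1.0 / m] * m  # fall back to equal weights
    return [d / total_div for d in diversities]


def weighted_distance(x, y, w):
    """Euclidean distance with each squared difference scaled by its weight."""
    return math.sqrt(sum(wj * (a - b) ** 2 for wj, a, b in zip(w, x, y)))


def split_outliers(data, w, radius, min_neighbors):
    """Assumed density test: a point with too few neighbours is isolated."""
    core, isolated = [], []
    for i, x in enumerate(data):
        neighbors = sum(
            1 for k, y in enumerate(data)
            if k != i and weighted_distance(x, y, w) <= radius
        )
        (core if neighbors >= min_neighbors else isolated).append(x)
    return core, isolated


def weighted_kmeans(data, w, k, iterations=20):
    """K-means using the weighted distance for assignment."""
    # Assumed initialisation: first point, then the point farthest from
    # the centers chosen so far.
    centers = [list(data[0])]
    while len(centers) < k:
        far = max(data, key=lambda x: min(weighted_distance(x, c, w) for c in centers))
        centers.append(list(far))
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x in data:
            idx = min(range(k), key=lambda c: weighted_distance(x, centers[c], w))
            clusters[idx].append(x)
        # Recompute each center as the mean of its members; keep empty ones.
        new_centers = [
            [sum(col) / len(members) for col in zip(*members)] if members else centers[c]
            for c, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters
```

In this sketch the weights would be computed once from the full data set, the isolated points set aside with `split_outliers`, and `weighted_kmeans` run on the remaining points. How the paper treats the isolated points afterwards is not stated in the abstract.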