
A Study of a K-means Algorithm Based on Hierarchical Clustering (cited by: 7)

A K-means Clustering Algorithm based on Hierarchy
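Abstract: Drawing on ideas from information theory, this paper analyzes the procedure of a hierarchy-based K-means clustering algorithm (HKMA). The algorithm first applies a hierarchical method to produce an initial clustering of the documents, and the resulting number of clusters is used as the value of k in the k-means algorithm. On this basis, k-means clustering is then used to refine the clustering result. Experimental results show that the execution time of HKMA is better overall than that of the k-means algorithm, and that its execution time grows more slowly as the data volume increases.
English abstract (as published): Probabilistic hierarchical clustering based on document information quantity. From an information theory angle, we study a K-means clustering algorithm based on hierarchy in this paper. First, the algorithm classifies documents into one or more predefined categories using hierarchical methods, and the total number of categories is taken as the number of clusters. Second, it uses k-means to modify the clustering results. Experimental results showed that these algorithms have higher mining efficiency in execution time, memory usage and CPU utilization than most current ones such as k-means.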
Source: 《微计算机信息》 (Control & Automation), 2010, No. 12, pp. 228-229, 232 (3 pages in total)
Keywords: cluster; k-means; hierarchical methods; text mining
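
The two-stage procedure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the distance-threshold stopping rule, the centroid-linkage merging, the seeding of k-means with the hierarchical centroids, and all function names (`hierarchical_init`, `kmeans_refine`, `hkma_sketch`) are assumptions made for illustration, and points are plain numeric tuples rather than document vectors.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def hierarchical_init(points, threshold):
    """Agglomerative pass: repeatedly merge the two clusters whose centroids
    are closest, until no pair is within `threshold`.
    The threshold is an assumed stopping rule, not the paper's criterion."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


def kmeans_refine(points, centers, max_iter=100):
    """Lloyd-style k-means iterations starting from the given centers.
    Expects max_iter >= 1."""
    centers = list(centers)
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [centroid(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups


def hkma_sketch(points, threshold):
    """Stage 1 fixes k via hierarchical clustering; stage 2 refines with k-means."""
    initial = hierarchical_init(points, threshold)
    k = len(initial)                          # number of clusters becomes k
    seeds = [centroid(c) for c in initial]    # assumed seeding choice
    centers, groups = kmeans_refine(points, seeds)
    return k, centers, groups
```

For example, `hkma_sketch([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], 3.0)` finds two well-separated groups, so k is 2 without being supplied by the caller. The naive pairwise search in `hierarchical_init` is cubic in the number of points and is only meant to show the control flow.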


