
On Improvement and Analysis of Hierarchical Clustering Algorithm

Cited by: 7
Abstract: Hierarchical agglomerative clustering (HAC) is a prominent and useful class of algorithms that iteratively merges the closest pair of clusters until all data points belong to one cluster. However, HAC methods have several drawbacks, such as high time and memory complexity during clustering and inefficient, inaccurate cluster validation. Empirical study shows that most HAC algorithms follow a trend: except at the top few levels of the dendrogram, the clusters formed at lower levels are very small and lie close to other clusters. An improved algorithm is proposed that significantly reduces time and memory complexity and makes validation efficient and accurate. Analysis and experiments both demonstrate the effectiveness of the proposed method.
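To make the baseline concrete, the closest-pair merging loop that the abstract describes can be sketched as follows. This is a minimal illustration of plain HAC only, not the improved algorithm proposed in the paper. The function name `naive_hac`, the choice of single linkage, and Euclidean distance are all assumptions made for the example. The exhaustive pair search in every round is what makes this naive form expensive (roughly cubic in the number of points as written), which is the kind of cost the abstract refers to.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def naive_hac(points):
    """Merge the closest pair of clusters until one cluster remains.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    which is the information a dendrogram is drawn from.
    `naive_hac` is a hypothetical name used only for this sketch.
    """
    # Each cluster is a tuple of point indices; start with singletons.
    clusters = [(i,) for i in range(len(points))]
    merges = []

    def single_link(a, b):
        # Single linkage (an assumption): distance between nearest members.
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Exhaustive search over all cluster pairs for the closest one.
        a, b = min(combinations(clusters, 2), key=lambda p: single_link(*p))
        merges.append((a, b, single_link(a, b)))
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 7.0)]
    for a, b, d in naive_hac(sample):
        print(a, b, round(d, 3))
```

Each returned tuple records one level of the dendrogram, from the earliest (lowest) merge to the final (top) merge.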
Source: Computer Applications and Software (《计算机应用与软件》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 6, pp. 243-244, 268 (3 pages)
Keywords: Clustering; Hierarchical clustering (HAC); Dendrogram; Cluster; POP

References (9)

  • 1. Fan Ming, Meng Xiaofeng, et al. Data Mining: Concepts and Techniques (in Chinese). China Machine Press, 2001: 223-260.
  • 2. Guo Chonghui, Tian Fengzhan, Jin Xiaoming, et al. Data Mining Tutorial (in Chinese). Tsinghua University Press, 2005: 107-138.
  • 3. Zhang T, Ramakrishnan R, Livny M. BIRCH: An efficient data clustering method for very large databases. In: Proceedings of the ACM SIGMOD Conference on Management of Data, Montreal, Canada, June 1996: 103-114.
  • 4. Guha S, Rastogi R, Shim K. CURE: An efficient clustering algorithm for large databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, 1998: 73-84.
  • 5. Day W H E, Edelsbrunner H. Efficient algorithms for agglomerative hierarchical clustering methods. Journal of Classification, 1984(1): 7-24.
  • 6. Anderberg M R. Cluster Analysis for Applications. Academic Press, New York, 1973.
  • 7. Karypis G, Han E H, Kumar V. CHAMELEON: A hierarchical clustering algorithm using dynamic modeling. IEEE Computer, 1999, 32: 68-75.
  • 8. Duda R O, Hart P E. Pattern Classification and Scene Analysis, chapter: Unsupervised Learning and Clustering. John Wiley & Sons, 1973.
  • 9. Dash M, Liu H, Scheuermann P, Tan K L. Fast hierarchical clustering and its validation. Data & Knowledge Engineering, 2003, 44: 109-138.

Co-citing documents: 2

Co-cited documents: 46

Citing documents: 7

Second-level citing documents: 30
