期刊文献+

一种基于PDS、TIE和PMI的快速PAM聚类算法 被引量:4

AN EFFICIENT PAM CLUSTERING ALGORITHM BASED ON PDS,TIE AND PMI
下载PDF
导出
摘要 PAM(Partitioning Around Medoids)是一种基于k-中心点的聚类算法,在处理数据集聚类时,具有较强的鲁棒性和准确性。但是,PAM算法的主要缺点是确定聚类中心点集所需的计算代价太高。对于大数据集,PAM聚类过程缓慢。提出一种利用部分距离搜索(PDS),先前中心点标号(PMI),以及三角不等式消除(TIE)准则等搜索策略来降低中心点迭代所需计算复杂性,实现快速PAM聚类的新算法。实验结果表明,相对于基本PAM聚类算法,在保持相同聚类效果的情况下,快速PAM聚类新算法能够减少70%~90%的乘法计算量,并可节省约1/3以上的计算时间。 PAM ( Partitioning Around Medoids) algorithm is one of the popular k-mediod clustering algorithms,which has strong robustness and correctness when processing large datasets. However, PAM clustering algorithm suffers from heavy computational burden in large data set processing. A novel efficient PAM algorithm is proposed, which utilizes Partial Distance Search (PDS), Previous Medoid Index (PMI), and Triangular Inequality Elimination (TIE) Criteria to facilitate distance computation when searching for optimal medoids. Experimental results demonstrate the effectiveness of this algorithm,which may reduce multiplications by from 70% to 90% and save at least 1/3 running time, while retaining exactly the same clustering quality comparing with the basic PAM clustering algorithm.
出处 《计算机应用与软件》 CSCD 北大核心 2008年第9期8-11,共4页 Computer Applications and Software
基金 国家自然科学基金项目(60673082)
关键词 聚类方法 PAM算法 搜索策略(PDS/PMI/TIE) 计算复杂度 Clustering methods PAM algorithm Search strategy(PDS/PMI/TIE) Computation complexity
  • 相关文献

参考文献8

  • 1Chu S C, Roddick J F, Pan J S. An Efficient K-Medoids-Based Algorithm Using Previous Medoid Index, Triangular Inequality Elimination Criteria, And Partial Distance Search. DaWak 2002, LNCS 2454,2002 : 63 - 72.
  • 2Elkan C. Using The Triangle Inequality To Accelerate K-Means. Proceedings of the Twentieth International Conference on Machine Learning, Washington DC,2003.
  • 3Zhang Q P, Couloigner i. A New And Efficient K-Medoid Algorithm For Spatial Clustering. ICCSA 2005, LNCS 3482,2005 : 181 - 189.
  • 4Bei C D,Gray R M. An Improvement of the Minimum Distortion Encoding Algorithm For Vector Quantization. IEEE Transactions on Communication, 1985,33 ( 10 ) : 1132 - 1133.
  • 5Chen S H,Pan J S. Fast Search Algorithm For VQ-Based Recognition Of Isolated Word. Communications, Speech and Vision, IEE Proceedings 1,1989,136(6) :391-396.
  • 6Ng R T, Han J W. CLARANS: A Method For Clustering Objects For Spatial Data Mining. IEEE Transactions on Knowledge and Data Engineering,2002,14(5 ) : 1003 - 1016.
  • 7Kaufman L, Rousseeuw P J. Finding Groups In Data: An Introduction To Cluster Analysis. John Wilsy & Sons,1990.
  • 8Khan S S, Ahmad A. Cluster Center Initialization Algorithm For K- Means Clustering. Pattern Recognition Letters 2004,25:293 - 1302.

同被引文献32

  • 1余建桥,张帆.基于数据场改进的PAM聚类算法[J].计算机科学,2005,32(1):165-167. 被引量:15
  • 2何振峰.一种基于限制的PAM算法[J].计算机工程与应用,2006,42(6):190-192. 被引量:5
  • 3刘亮,王相海.一类工作调度问题的回溯解法[J].计算机工程与设计,2006,27(18):3338-3339. 被引量:8
  • 4张钊,王锁柱,张雨.一种基于SOM和PAM的聚类算法[J].计算机应用,2007,27(6):1400-1402. 被引量:8
  • 5GODDOGER.多核处理器_百度百科[EB/OL].(2009-12-26).http://baike.baidu.com/view/2797908.htm?fr=ala0_1_1.
  • 6TOP500.ORG.TOP500ListHighlights[EB/OL].(2009-11).http://www.top500.org/lists/2009/11/highlights.
  • 7CONSTANTINOU T, SAZEIDES Y, MICHAUD P, et al. Perfor-mance implications of single thread migration on a chip multi-core[J].ACM SIGARCH Computer Architecture News,2005,33(4):80-91.
  • 8MAKHTER S,ROBERT J.多核程序设计技术[M].李宝峰,富弘毅,李韬,译.北京:电子工业出版社,2007.
  • 9HAE S P, CHI H J. A simple and fast algorithm for K-Medoids clustering[J].Expert Systems with Applications,2009,36(2):3336-3341.
  • 10Dataset[EB/OL].(2010).http://archive.ics.uci.edu/ml/datasets.html.

引证文献4

二级引证文献15

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部