
New K-medoids clustering algorithm based on granular computing
Abstract: The traditional K-medoids clustering algorithm has two drawbacks: its results fluctuate with the choice of initial centers, and its high computational complexity makes it unsuitable for large datasets. The fast K-medoids algorithm improves on the traditional one by choosing better initial centers, but several of its initial centers may still fall within the same cluster. To overcome the shortcomings of both algorithms, this paper proposes a K-medoids clustering algorithm based on granular computing. The algorithm introduces the concept of granularity and defines a new similarity function between samples. Granules are produced by an equivalence relation, and the density of a granule is defined by the number of samples it contains. The center samples of the K densest granules are selected as the initial centers for K-medoids clustering. Experiments on datasets from the UCI machine learning repository and on randomly generated synthetic datasets show that the proposed algorithm finds better initial centers, achieves better clustering accuracy and sum of squared clustering error than the traditional and fast K-medoids algorithms, gives more stable results, and is applicable to large datasets.
Authors: MA Qing (马箐), XIE Juanying (谢娟英)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2012, No. 7, pp. 1973-1977 (5 pages)
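Funding: Natural Science Foundation of Shaanxi Province (2010JM3004); Fundamental Research Funds for the Central Universities (GK201102007); Shaanxi Normal University 2011 Graduate Training Innovation Fund (2011CXS029)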
Keywords: traditional K-medoids clustering algorithm; fast K-medoids clustering algorithm; granular computing; equivalence relation; clustering
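
The initialization idea described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the paper's similarity function or equivalence relation, so the sketch assumes that two samples are equivalent when they fall into the same grid cell of width `cell`. The names `granular_init`, `k_medoids`, `sse`, and `cell` are all hypothetical, and the iteration step is a plain alternating assign-and-update K-medoids loop.

```python
import math
from collections import defaultdict


def dist(a, b):
    """Euclidean distance between two samples given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def medoid(points):
    """Return the sample with the smallest total distance to the others."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))


def granular_init(data, k, cell=1.0):
    """Pick initial medoids from the k densest granules.

    ASSUMPTION: samples are equivalent when they share a grid cell of
    width `cell`; the paper defines its own equivalence relation.
    Returns fewer than k medoids if fewer than k granules exist.
    """
    granules = defaultdict(list)
    for p in data:
        granules[tuple(math.floor(x / cell) for x in p)].append(p)
    # Granule density = number of samples the granule contains.
    densest = sorted(granules.values(), key=len, reverse=True)[:k]
    return [medoid(g) for g in densest]


def assign(data, medoids):
    """Assign every sample to its nearest medoid."""
    clusters = [[] for _ in medoids]
    for p in data:
        nearest = min(range(len(medoids)), key=lambda i: dist(p, medoids[i]))
        clusters[nearest].append(p)
    return clusters


def k_medoids(data, medoids, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    for _ in range(max_iter):
        clusters = assign(data, medoids)
        updated = [medoid(c) if c else m for c, m in zip(clusters, medoids)]
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(data, medoids)


def sse(medoids, clusters):
    """Sum of squared distances from each sample to its cluster medoid."""
    return sum(dist(p, m) ** 2 for m, c in zip(medoids, clusters) for p in c)


if __name__ == "__main__":
    data = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4),
            (5.1, 5.2), (5.3, 5.4), (5.2, 5.1), (9.0, 0.5)]
    init = granular_init(data, k=2)
    medoids, clusters = k_medoids(data, init)
    print(init, medoids, sse(medoids, clusters))
```

Because the initial medoids come from distinct dense granules, the sketch avoids random seeding; whether two seeds can still land in one true cluster depends on the granule width assumed here, which is not something the abstract specifies.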

