
A Cluster Validity Function Based on Geometric Probability (cited by: 1)
Abstract: Cluster validity is a fundamental unsolved problem in cluster analysis, and determining the optimum cluster number is its main research topic. Taking geometric probability as the theoretical basis, this article proposes a new cluster validity function for two-dimensional datasets that is used to determine the optimum cluster number. The function uses the correspondence between a two-dimensional dataset and a two-dimensional discrete point set, and measures the cluster structure of the dataset from the way the point set is distributed in the feature space. The idea is intuitive and easy to understand. During measurement, every pair of points in the point set is connected, producing a set of line segments that preserves the structural information of the point set. The cluster validity function is then constructed by comparing the direction values of the segments in this set with the direction values expected under completely random conditions. The experimental results show that, for a given sample dataset, the shape of the curve generated by the function can be used to determine the optimum cluster number of a two-dimensional dataset effectively and to guide the design of clustering algorithms.
Source: Journal of Image and Graphics (《中国图象图形学报》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 12, pp. 2351-2356 (6 pages)
Funding: National Natural Science Foundation of China (40471113)
Keywords: cluster validity; geometric probability; cluster analysis; optimum cluster number
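
The abstract describes connecting every pair of points into a line segment and comparing the segment directions with those expected under complete randomness. The Python sketch below illustrates that idea only. It is not the paper's function, whose exact form is not given in this record. The names `segment_directions` and `direction_deviation`, the bin count, and the assumption that directions are uniform on [0, π) under complete randomness are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def segment_directions(points):
    """Direction angle in [0, pi) of the segment joining each pair of points."""
    angles = []
    for (x1, y1), (x2, y2) in combinations(points, 2):
        dx, dy = x2 - x1, y2 - y1
        if dx == 0 and dy == 0:
            continue  # coincident points define no direction
        # A segment has no orientation, so fold the angle into [0, pi).
        angles.append(math.atan2(dy, dx) % math.pi)
    return angles


def direction_deviation(points, bins=18):
    """Hypothetical score, not the paper's function.

    Total variation distance between the histogram of segment directions
    and a uniform histogram (the assumed 'completely random' reference).
    0 means the directions look uniform; values near 1 mean they are
    concentrated in very few bins.
    """
    angles = segment_directions(points)
    if not angles:
        return 0.0
    counts = [0] * bins
    for a in angles:
        # Clamp guards against an angle that rounds up to exactly pi.
        idx = min(int(a / math.pi * bins), bins - 1)
        counts[idx] += 1
    n = len(angles)
    expected = 1.0 / bins
    return 0.5 * sum(abs(c / n - expected) for c in counts)


if __name__ == "__main__":
    collinear = [(0, 0), (1, 0), (2, 0), (3, 0)]
    print(direction_deviation(collinear))
```

Binning the directions keeps the comparison simple. How the paper turns such a comparison into a curve over candidate cluster numbers is not stated in this record, so that step is left out here.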

References (14)

  • 1. Han J, Kamber M. Data Mining: Concepts and Techniques [M]. Los Altos, CA, USA: Morgan Kaufmann, 2001.
  • 2. Bezdek J C, Pal N R. Some new indexes of cluster validity [J]. IEEE Transactions on Systems, Man and Cybernetics-Part B: Cybernetics, 1998, 28(3): 301-305.
  • 3. Backer E, Jain A K. A clustering performance measure based on fuzzy set decomposition [J]. IEEE Transactions on PAMI, 1981, 3(1): 66-95.
  • 4. Windham M P. Cluster validity for the fuzzy c-means clustering algorithm [J]. IEEE Transactions on PAMI, 1982, 4(4): 357-363.
  • 5. Al-Sultan K S, Selim S Z. Global algorithm for fuzzy clustering problem [J]. Pattern Recognition, 1993, 26(9): 1357-1361.
  • 6. Jain A K, Murty M N, Flynn P J. Data clustering: a review [J]. ACM Computing Surveys, 1999, 31(3): 265-323.
  • 7. Nakamura E, Kehtarnavaz N. Determining number of clusters and prototype locations via multi-scale clustering [J]. Pattern Recognition Letters, 1998, 19(14): 1265-1283.
  • 8. Pakhira M K, Bandyopadhyay S, Maulik U. Validity index for crisp and fuzzy clusters [J]. Pattern Recognition, 2004, 37(3): 487-501.
  • 9. Kim M, Ramakrishna R S. New indices for cluster validity assessment [J]. Pattern Recognition Letters, 2005, 26(15): 2353-2363.
  • 10. Kim D W, Lee K H, Lee D. On cluster validity index for estimation of optimal number of fuzzy clusters [J]. Pattern Recognition, 2004, 37(10): 2009-2025.


