Abstract
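This paper analyzes the characteristics of several commonly used cluster validity indexes and derives the rules that the design of a cluster validity index must obey. Based on these rules and on the relationships among intra-cluster distance, inter-cluster distance, and the noise cluster, a group of new validity indexes is proposed. They perform better than existing results in computing the optimal number of clusters, and are particularly suited to evaluating clustering quality on datasets that contain arbitrarily shaped clusters and clusters of uneven density. Experiments further compare this group of validity indexes and yield some new results.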
We examined several popular cluster validity indexes and showed the laws that they must obey. A group of new cluster validity indexes was presented to calculate the lower and upper bounds of the real number of clusters in a dataset. These indexes can be applied efficiently to datasets with arbitrarily shaped and density-skewed clusters. Two experiments were used to verify the effectiveness of the proposed design.
Source
《模式识别与人工智能》
EI
CSCD
Peking University Core Journals
2004, Issue 4, pp. 516-522 (7 pages)
Pattern Recognition and Artificial Intelligence
Funding
Key project funded by the National "863" Program of China (No. 2002AA412010-12)
Keywords
Cluster Analysis
Validity
Intradistance
Interdistance