Abstract
A cluster validity function is an index for evaluating the quality of a clustering result, and supplying an accurate initial number of clusters makes the result more reasonable. Drawing on fuzzy uncertainty theory and the basic properties of the clustering problem, the paper introduces a new compactness measure, Di(U;c), and builds on it a validity function for finding the optimal number of clusters. The function is based on the compactness and separation of the dataset and takes into account both the membership degrees of the data points and the geometric structure of the dataset. Experimental results show that the validity function identifies the optimal number of clusters, performs well on datasets with a clear cluster structure, and is robust to the weighting coefficient m.
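The general idea of choosing the number of clusters with a compactness/separation validity index can be illustrated with a small sketch. The abstract does not define Di(U;c) or the proposed validity function, so the sketch below does not reproduce them; as an assumption it substitutes the well-known Xie-Beni index, which likewise divides a membership-weighted compactness term by the minimum separation between cluster centers. The function names, the farthest-point initialization, and the synthetic three-blob data are all hypothetical choices made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_centers(points, c):
    """Deterministic farthest-point initialization (an illustrative choice)."""
    centers = [points[0]]
    while len(centers) < c:
        centers.append(max(points, key=lambda p: min(sq_dist(p, v) for v in centers)))
    return centers


def memberships(points, centers, m):
    """Fuzzy c-means membership update: u_ik is proportional to d_ik^(-2/(m-1))."""
    u = []
    for p in points:
        inv = [max(sq_dist(p, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
        total = sum(inv)
        u.append([x / total for x in inv])
    return u


def fuzzy_c_means(points, c, m=2.0, max_iter=200, tol=1e-8):
    """Standard fuzzy c-means; returns (centers, membership matrix)."""
    centers = init_centers(points, c)
    dim = len(points[0])
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        new_centers = []
        for k in range(c):
            w = [row[k] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wi * p[j] for wi, p in zip(w, points)) / total
                for j in range(dim)))
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)


def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni-style index: compactness / (n * min separation); lower is better."""
    compact = sum(u[i][k] ** m * sq_dist(p, v)
                  for i, p in enumerate(points)
                  for k, v in enumerate(centers))
    separation = min(sq_dist(centers[k], centers[l])
                     for k in range(len(centers))
                     for l in range(k + 1, len(centers)))
    if separation == 0.0:
        return float("inf")  # coincident centers: treat as the worst score
    return compact / (len(points) * separation)


def choose_cluster_count(points, c_max=6, m=2.0):
    """Scan c = 2..c_max and return the c with the smallest index value."""
    scores = {}
    for c in range(2, c_max + 1):
        centers, u = fuzzy_c_means(points, c, m)
        scores[c] = xie_beni(points, centers, u, m)
    return min(scores, key=scores.get), scores


def demo_points(seed=42, per_blob=30, sigma=0.5):
    """Synthetic data (hypothetical): three well-separated 2-D Gaussian blobs."""
    rng = random.Random(seed)
    blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
    return [(rng.gauss(cx, sigma), rng.gauss(cy, sigma))
            for cx, cy in blob_centers for _ in range(per_blob)]
```

Calling `choose_cluster_count(demo_points())` scans candidate cluster counts and returns the one that minimizes the index together with all scores. Repeating the scan with different values of `m` is one simple way to probe sensitivity to the weighting coefficient, which is the kind of robustness the abstract reports for its own function.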
Source
《计算机工程与应用》
CSCD
Peking University Core Journals (北大核心)
2010, Issue 6, pp. 124-126, 132 (4 pages)
Computer Engineering and Applications
Keywords
fuzzy clustering
clustering validity
fuzzy c-means
clustering compactness
clustering separation