Abstract
Fuzzy C-Means (FCM) is an effective and widely used clustering algorithm, but it has several shortcomings: the number of clusters must be determined manually, the algorithm easily converges to a local optimum because the initial cluster centers are selected at random, and Euclidean distance performs poorly on complex high-dimensional data. To solve these problems, an improved FCM clustering algorithm based on density Canopy and manifold learning (DM-FCM) is proposed. First, a density Canopy algorithm based on an improved local density is proposed to automatically determine the number of clusters and the initial cluster centers, which improves the adaptability and stability of the algorithm. Then, because high-dimensional data often exhibit a nonlinear structure, a manifold learning method is applied to construct a manifold spatial structure that preserves the global geometric properties of complex high-dimensional data and improves the clustering performance of the algorithm on such datasets. The Fowlkes-Mallows Index (FMI), the V-measure (the weighted harmonic mean of homogeneity and completeness), Adjusted Mutual Information (AMI), and the Adjusted Rand Index (ARI) are used as performance measures for the clustering algorithms. The experimental results show that the manifold-learning-based distance is the superior distance measure, and that DM-FCM improves clustering accuracy and performs well on both low-dimensional and complex high-dimensional data.
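For orientation, the baseline that DM-FCM modifies can be sketched as follows. This is a minimal illustration of standard FCM with Euclidean distance, not the proposed DM-FCM: it takes the initial centers as an argument (the step the density Canopy algorithm is meant to automate) and measures distance with the Euclidean metric (the step the manifold learning method is meant to replace). The function names `memberships` and `fcm`, the default fuzzifier `m = 2`, and the stopping tolerance are assumptions made for illustration.

```python
import math


def memberships(points, centers, m=2.0):
    """Membership matrix u[i][j] of point i in cluster j (standard FCM rule)."""
    power = 2.0 / (m - 1.0)
    u = []
    for x in points:
        dists = [math.dist(x, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: split full
            # membership among those centers and give zero to the rest.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            u.append([h / total for h in hits])
            continue
        # u_ij = d_ij^(-p) / sum_k d_ik^(-p), with p = 2 / (m - 1)
        inv = [d ** -power for d in dists]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u


def fcm(points, init_centers, m=2.0, max_iter=100, tol=1e-6):
    """Alternate membership and center updates until the centers stop moving."""
    centers = [list(c) for c in init_centers]
    dim = len(points[0])
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        new_centers = []
        for j, old in enumerate(centers):
            # Center j is the mean of all points weighted by u_ij^m.
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(old)  # empty cluster: keep the old center
                continue
            new_centers.append([
                sum(w * x[d] for w, x in zip(weights, points)) / total
                for d in range(dim)
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # Recompute memberships so they match the returned centers.
    return centers, memberships(points, centers, m)
```

Because the result depends on `init_centers` and on the number of centers supplied, this sketch makes the two manual choices criticized above explicit; a hard labeling, if needed, can be taken as the index of the largest membership in each row.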
Funding
This work was supported by the National Natural Science Foundation of China (No. 62262011) and the Natural Science Foundation of Guangxi (No. 2021JJA170130).