Abstract
Fuzzy C-means clustering (FCM) is a fast and effective clustering algorithm, but it does not take the sample size of each class into account. When class sizes differ greatly, its clustering decision is biased toward the classes with fewer samples. This paper proposes a new clustering algorithm, equalization fuzzy C-means clustering (EFCM). The minimization objective function of FCM is modified so that the EFCM objective function includes a sample-size factor. The optimal parameters of EFCM are solved with the particle swarm optimization (PSO) algorithm, using the sample fuzzy memberships as the encoding object. The properties of EFCM are analyzed theoretically, and simulation experiments verify its effectiveness on both balanced and unbalanced datasets.
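For context, the baseline that the abstract sets out to modify can be sketched as follows. This is a minimal sketch of conventional FCM only, written in pure Python. It is not EFCM: the abstract does not give the sample-size term of the modified objective or the PSO encoding. The function name `fcm`, the fuzzifier `m`, the membership table `u`, and the example data are assumptions introduced here for illustration.

```python
import random


def fcm(points, c, m=2.0, iters=100, seed=0):
    """Baseline fuzzy C-means (not EFCM) on a list of equal-length tuples.

    Returns (centers, u), where u[j][i] is the membership of sample j in cluster i.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalized so each sample's row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the samples weighted by u[j][i] ** m.
        centers = []
        for i in range(c):
            w = [u[j][i] ** m for j in range(n)]
            w_sum = sum(w)
            centers.append(tuple(
                sum(w[j] * points[j][d] for j in range(n)) / w_sum
                for d in range(dim)
            ))

        # Membership update from squared distances to each center
        # (clamped away from zero to avoid division by zero).
        for j in range(n):
            d2 = [
                max(sum((points[j][d] - centers[i][d]) ** 2 for d in range(dim)), 1e-12)
                for i in range(c)
            ]
            inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
            inv_sum = sum(inv)
            u[j] = [v / inv_sum for v in inv]

    return centers, u


if __name__ == "__main__":
    # Illustrative unbalanced 1-D data: 20 samples near 0-2, 3 samples near 10.
    data = [(0.1 * k,) for k in range(20)] + [(10.0,), (10.2,), (9.8,)]
    centers, u = fcm(data, c=2)
    print(centers)
```

Neither update step contains any term for the number of samples in a class, which is the gap the abstract describes. Varying the class sizes in the example data is one way to look at how the baseline behaves on unbalanced input.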
Source
Computer Science (《计算机科学》)
CSCD
PKU Core Journals (北大核心)
2014, Issue 8, pp. 250-253 (4 pages)
Funding
Supported by the National Natural Science Foundation of China (61170126)
Keywords
Fuzzy C-means clustering, Sample size, Equalization, Particle swarm, Global optimal solution