Abstract
To address the sensitivity of the fuzzy C-means (FCM) clustering algorithm to outliers and to randomly initialized cluster centers, this paper combines stacked sparse autoencoders with the traditional FCM algorithm to improve it. Stacked sparse autoencoders can extract features of the original data set from low level to high level, and the high-level features usually reflect the essential characteristics of the samples to be clustered better than the original data do; clustering on these features instead of the original data therefore helps improve clustering quality. Experiments on several standard UCI data sets show that the improved algorithm is effective and feasible.
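The clustering step named in the abstract can be sketched as follows. This is a minimal, illustrative version of standard fuzzy C-means in plain Python, not the authors' implementation: the function name `fcm`, its parameters, and the defaults (fuzzifier `m=2.0`, Euclidean distance, centers initialized from randomly chosen samples) are all assumptions. The stacked sparse autoencoder is not shown; under the approach described above, `points` would hold the high-level features it produces rather than the raw data.

```python
import random


def fcm(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means on a list of equal-length feature vectors.

    Hypothetical helper for illustration. Returns (centers, memberships),
    where memberships[j][i] is the degree to which point j belongs to cluster i.
    """
    rng = random.Random(seed)
    # Random initial centers drawn from the data; this is the
    # initialization the abstract describes FCM as being sensitive to.
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    dim = len(points[0])
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ji = 1 / sum_k (d_ji / d_jk)^(2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [
                # Clamp to avoid division by zero when a point sits on a center.
                max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                for c in centers
            ]
            memberships.append(
                [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]
            )
        # Center update: mean of the points weighted by u^m.
        new_centers = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            new_centers.append(
                [sum(w * p[k] for w, p in zip(weights, points)) / total
                 for k in range(dim)]
            )
        # Stop once no center coordinate moves by more than tol.
        shift = max(
            abs(a - b)
            for old, new in zip(centers, new_centers)
            for a, b in zip(old, new)
        )
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships
```

Calling `fcm(features, n_clusters)` on a list of feature vectors returns the cluster centers and one membership row per sample; a hard label for each sample can be taken as the index of the largest entry in its row.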
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
PKU Core Journal (北大核心)
2015, No. 4, pp. 154-157 (4 pages)
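Funding
Jiangsu Province universities "Qing Lan Project" (青蓝工程) funding program for young and middle-aged academic leaders
Natural Science Foundation of Anhui Province (No. 1208085MA15)
Hefei University Key Construction Discipline Fund for Applied Mathematics (No. 2014xk08)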