Clustering Algorithm for Mixed Data Based on Dimensional Frequency Dissimilarity and Strongly Connected Fusion (cited by: 5)
Abstract The k-Prototypes algorithm is sensitive to the selection of initial prototypes, so its clustering results vary from run to run, and it ignores the overall difference between a data point and the samples already assigned to a cluster. To address these problems, a clustering algorithm for mixed data based on dimensional frequency dissimilarity and strongly connected fusion is proposed. Multiple rounds of pre-clustering first produce a large number of sub-clusters; the sub-clusters are then merged by a strongly connected fusion strategy, according to the connectivity between them, to generate the final clusters. The algorithm is validated on three mixed-attribute datasets from the UCI repository and compared with three mixed data clustering algorithms. The experimental results show that it yields better clustering precision and purity than k-Prototypes and the existing mixed-attribute clustering algorithms.
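The abstract names two ingredients but gives no formulas, so the following Python sketch is only an illustration under stated assumptions, not the authors' method. It assumes that the dissimilarity between a sample and a cluster on each categorical dimension is one minus the relative frequency of the sample's value in that cluster, and that "strongly connected fusion" can be modelled as merging sub-clusters that fall into the same strongly connected component of a directed graph. The function names `frequency_dissimilarity` and `fuse_subclusters`, and the way edges are supplied, are hypothetical.

```python
from collections import Counter


def frequency_dissimilarity(sample, cluster):
    """Assumed measure: mean over dimensions of (1 - relative frequency
    of the sample's categorical value among the cluster's members)."""
    if not cluster:
        raise ValueError("cluster must contain at least one sample")
    n = len(cluster)
    total = 0.0
    for d, value in enumerate(sample):
        counts = Counter(row[d] for row in cluster)
        total += 1.0 - counts[value] / n
    return total / len(sample)


def fuse_subclusters(n, edges):
    """Label n sub-clusters by strongly connected component (Kosaraju).

    edges is a list of directed pairs (u, v); how an edge is derived from
    the pre-clustering runs is left open here (an assumption).
    Recursive, so intended for small graphs only.
    """
    graph = [[] for _ in range(n)]
    reverse = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)
        reverse[v].append(u)

    # First pass: record vertices in order of DFS completion.
    order, seen = [], [False] * n

    def visit(u):
        seen[u] = True
        for v in graph[u]:
            if not seen[v]:
                visit(v)
        order.append(u)

    for u in range(n):
        if not seen[u]:
            visit(u)

    # Second pass: sweep the reversed graph in reverse completion order.
    label = [-1] * n

    def assign(u, component):
        label[u] = component
        for v in reverse[u]:
            if label[v] == -1:
                assign(v, component)

    component = 0
    for u in reversed(order):
        if label[u] == -1:
            assign(u, component)
            component += 1
    return label


if __name__ == "__main__":
    cluster = [("a", "x"), ("a", "y"), ("b", "x")]
    print(frequency_dissimilarity(("a", "x"), cluster))
    # Sub-clusters 0 and 1 reach each other, as do 2 and 3; 4 is isolated.
    print(fuse_subclusters(5, [(0, 1), (1, 0), (1, 2), (2, 3), (3, 2)]))
```

In this sketch, sub-clusters that share a label would be merged into one final cluster. A one-way edge such as 1 to 2 does not trigger a merge, which is the property that distinguishes strong connectivity from ordinary connectivity.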
Source Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list, 2016, No. 1, pp. 82-89 (8 pages)
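Funding Supported by the Special Scientific Research Project for Public Welfare Industries of the Ministry of Water Resources (No. 201401044)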
Keywords Dimensional Frequency Dissimilarity, Mixed Attribute, Clustering, Strongly Connected Fusion