
Enhanced clustering ensemble algorithm based on characteristics of data sets
Abstract: Popular clustering ensemble algorithms cannot adapt their processing to the differing characteristics of different data sets. To overcome this defect, a new algorithm is proposed: the Enhanced Clustering Ensemble algorithm based on Characteristics of Data sets (ECECD). ECECD consists of three components: generation of base clusterings, selection of base clusterings, and a consensus function. Guided by the characteristics of the data set, it uses a heuristic method to select suitable base clusterings, forms the final ensemble from them, and produces the final clustering. In the experiments, three benchmark data sets (ecoli, leukaemia, and Vehicle) were clustered. The clustering errors of the proposed algorithm were 0.014, 0.489, and 0.361 respectively (the Chinese abstract gives 0.479 for the third value), always the lowest when compared with the Bagging based Structure Ensemble Approach (BSEA), Hybrid Cluster Ensemble (HCE), and Cluster-Oriented Ensemble Classifier (COEC) algorithms. As the number of candidate base clusterings increased, the Normalized Mutual Information (NMI) of the proposed algorithm also remained higher than that of the compared algorithms. The experimental results indicate that, among the clustering ensemble algorithms compared, the proposed algorithm has the highest clustering precision and the strongest scalability.
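Authors: Hou Yong (侯勇), Zheng Xuefeng (郑雪峰)
Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2013, No. 8, pp. 2204-2207, 2249 (5 pages)
Funding: Shandong Province Enterprise Training and Staff Education Research Project (2012-277); Weifang Social Science Planning Key Research Project (潍社科学术委发[2011]2号); Shandong Province Higher Education Humanities and Social Sciences Research Program (J08WG71)
Keywords: base clustering; consensus function; clustering ensemble algorithm; clustering error; adaptivity; Normalized Mutual Information (NMI)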
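
The abstract names three stages (base clustering generation, selection, and a consensus function) and two evaluation measures (clustering error and NMI), but it does not give ECECD's actual heuristic or consensus function. The following is only a minimal sketch of how such a pipeline might be wired together in plain Python. The selection rule (`select_members`, ranking by mean pairwise NMI) and the co-association consensus (`consensus`) are hypothetical stand-ins, not the authors' method, and all function names are illustrative. NMI is normalized here by the geometric mean of the two entropies, which is one common convention.

```python
import math
from collections import Counter
from itertools import permutations


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(a, b):
    """Normalized mutual information, I(a;b) / sqrt(H(a) * H(b))."""
    n = len(a)
    ca, cb, joint = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum((c / n) * math.log(c * n / (ca[x] * cb[y]))
             for (x, y), c in joint.items())
    ha, hb = entropy(a), entropy(b)
    if ha == 0 and hb == 0:
        return 1.0  # both are single-cluster partitions (convention)
    if ha == 0 or hb == 0:
        return 0.0
    return mi / math.sqrt(ha * hb)


def clustering_error(truth, pred):
    """Fraction of points misassigned under the best one-to-one mapping
    of predicted clusters to true classes (brute force, small k only)."""
    classes, clusters = sorted(set(truth)), sorted(set(pred))
    # Pad with None so surplus clusters can be mapped to "no class".
    targets = classes + [None] * max(0, len(clusters) - len(classes))
    joint = Counter(zip(pred, truth))
    best = max(sum(joint[(c, t)] for c, t in zip(clusters, perm))
               for perm in permutations(targets, len(clusters)))
    return 1.0 - best / len(truth)


def select_members(base, keep):
    """Hypothetical selection rule (an assumption, not ECECD's heuristic):
    keep the `keep` base clusterings with the highest mean NMI to the rest."""
    def mean_nmi(i):
        others = [nmi(base[i], base[j]) for j in range(len(base)) if j != i]
        return sum(others) / len(others)
    ranked = sorted(range(len(base)), key=mean_nmi, reverse=True)
    return [base[i] for i in ranked[:keep]]


def consensus(members, threshold=0.5):
    """Stand-in consensus function: link two points when they share a
    cluster in more than `threshold` of the members, then return the
    connected components of that graph as the final clustering."""
    n = len(members[0])
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] != -1:
                    continue
                agree = sum(m[i] == m[j] for m in members) / len(members)
                if agree > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1]
    # Toy base clusterings, as if produced by a generation stage.
    base = [[0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0],
            [0, 0, 1, 1, 1, 1], [0, 1, 0, 1, 0, 1]]
    final = consensus(select_members(base, keep=3))
    print(final, clustering_error(truth, final), nmi(truth, final))
```

The brute-force label matching in `clustering_error` is only practical for a handful of clusters; for larger problems an assignment solver would be the usual replacement.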
