
Cluster Ensemble Algorithm Using Affinity Propagation (采用仿射传播的聚类集成算法)

Cited by: 10
Abstract: K-means clustering gives unstable results because its initial cluster centers are chosen at random. To address this, a cluster ensemble algorithm based on affinity propagation is proposed. The algorithm treats the result of each ensemble member as an attribute of the original data and then combines the members' clustering results in a weighted ensemble, using simple and efficient affinity propagation clustering as the ensemble step. Three variants are proposed: a direct ensemble, an ensemble weighted by average normalized mutual information (NMI), and an ensemble weighted by the Silhouette cluster validity index. Finally, the Hungarian algorithm is used to unify and match the category labels of the affinity propagation ensemble results. Experiments on University of California, Irvine (UCI) data sets show that, compared with the K-means clusterings before combination and with other cluster ensemble algorithms, the proposed algorithm effectively improves the accuracy, robustness, and stability of the clustering results. The resulting cluster ensemble algorithm is extensible and flexible, as well as simple and effective.
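Source: Journal of Xi'an Jiaotong University (《西安交通大学学报》), 2011, No. 8, pp. 1-6 (6 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心).
Funding: National Natural Science Foundation of China (60673024); Specialized Research Fund for the Doctoral Program of Higher Education (20100201110063); National Defense Pre-Research Project of the Eleventh Five-Year Plan.
Keywords: affinity propagation; weighted cluster ensemble; K-means clustering; Hungarian algorithm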
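
The abstract names two building blocks that are easy to illustrate in isolation: weighting ensemble members by their average NMI, and matching cluster labels between partitions. The following is a minimal standard-library sketch, not the paper's implementation. Several choices are assumptions made only for illustration: the function names, the square-root normalization of NMI, the use of a weighted co-association matrix as the similarity passed to affinity propagation, and a brute-force permutation search standing in for the Hungarian algorithm (it solves the same assignment problem, but only scales to a handful of clusters). The affinity propagation step itself is not shown.

```python
from collections import Counter
from itertools import permutations
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(a, b):
    """Normalized mutual information, I(a; b) / sqrt(H(a) * H(b))."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )
    denom = sqrt(entropy(a) * entropy(b))
    return mi / denom if denom > 0 else 0.0


def average_nmi_weights(partitions):
    """Weight each member by its mean NMI with the others; weights sum to 1."""
    m = len(partitions)
    scores = [
        sum(nmi(p, q) for j, q in enumerate(partitions) if j != i) / (m - 1)
        for i, p in enumerate(partitions)
    ]
    total = sum(scores)
    return [s / total for s in scores] if total > 0 else [1.0 / m] * m


def weighted_similarity(partitions, weights):
    """Assumed construction: S[i][j] is the total weight of the members
    that place points i and j in the same cluster."""
    n = len(partitions[0])
    return [
        [sum(w for p, w in zip(partitions, weights) if p[i] == p[j])
         for j in range(n)]
        for i in range(n)
    ]


def match_labels(reference, labels):
    """Relabel `labels` to maximize agreement with `reference`.

    Brute-force stand-in for the Hungarian algorithm. Assumes `labels`
    has no more distinct clusters than `reference`.
    """
    ref_ids = sorted(set(reference))
    own_ids = sorted(set(labels))
    overlap = Counter(zip(labels, reference))
    best = max(
        permutations(ref_ids, len(own_ids)),
        key=lambda perm: sum(overlap[(o, r)] for o, r in zip(own_ids, perm)),
    )
    mapping = dict(zip(own_ids, best))
    return [mapping[lab] for lab in labels]


# Three hypothetical K-means runs on six points (made-up labels).
members = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, labels swapped
    [0, 0, 1, 1, 1, 1],  # disagrees on the third point
]
weights = average_nmi_weights(members)
similarity = weighted_similarity(members, weights)
aligned = match_labels(members[0], members[1])
```

In this toy input the first two members describe the same partition, so they receive equal weights and the third member receives a smaller one. The `similarity` matrix could then be handed to any affinity propagation implementation that accepts precomputed similarities, and `match_labels` applied afterwards to bring label sets into agreement.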


