
A Robust Subspace Clustering Algorithm (cited by: 4)
Abstract: To address the curse of dimensionality and the noise contamination that clustering analysis commonly faces, a robust subspace clustering algorithm is proposed that combines sample weighting with subspace clustering. Building on existing subspace clustering methods, the algorithm computes for each cluster a weight vector that reflects how much each dimension contributes to identifying that cluster, and obtains the subspace in which each cluster lies by combining the dimensions according to these weights. In addition, the algorithm assigns each sample a scale parameter that reflects its degree of outlyingness, so that normal samples and outliers play different roles during clustering, which ensures the robustness of the algorithm. Comparative experiments on two-dimensional, high-dimensional, and gene datasets show that the algorithm achieves high clustering accuracy and good robustness on datasets of various dimensions with different noise ratios.
Source: Journal of Xi'an Jiaotong University (西安交通大学学报), 2011, No. 6, pp. 13-19 (7 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61070137, 60933009); Shaanxi Province Science and Technology Research Project (2009K1-56).
Keywords: subspace clustering; robustness; weight parameter; optimization
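
The abstract describes two ingredients: a per-cluster weight vector over dimensions and a per-sample scale parameter that reduces the influence of outliers. The record does not give the paper's objective function or update rules, so the following is only a minimal sketch of how such a scheme might look in a k-means-style loop. The function names, the Cauchy-type form of the scale, the exponential weight rule, and the parameters `gamma` and `cutoff` are all illustrative assumptions, not the authors' method.

```python
import math
import statistics


def weighted_sq_dist(x, center, w):
    """Squared Euclidean distance with one weight per dimension."""
    return sum(wj * (xj - cj) ** 2 for xj, cj, wj in zip(x, center, w))


def robust_subspace_kmeans(data, init_centers, n_iter=20, gamma=1.0, cutoff=3.0):
    """Hypothetical sketch: k-means with per-cluster dimension weights
    and per-sample scales. All update rules here are assumptions."""
    k, dim = len(init_centers), len(data[0])
    centers = [list(c) for c in init_centers]
    weights = [[1.0 / dim] * dim for _ in range(k)]  # one weight vector per cluster
    scales = [1.0] * len(data)                       # one scale per sample
    labels = [0] * len(data)

    for _ in range(n_iter):
        # 1. Assign each sample to the nearest center, measuring distance
        #    with that cluster's own dimension weights.
        dists = []
        for i, x in enumerate(data):
            d = [weighted_sq_dist(x, centers[c], weights[c]) for c in range(k)]
            labels[i] = min(range(k), key=d.__getitem__)
            dists.append(d[labels[i]])

        # 2. Per-sample scale (assumed Cauchy-type form): samples far from
        #    their center, relative to the median distance, get a scale near 0.
        ref = cutoff * max(statistics.median(dists), 1e-12)
        scales = [1.0 / (1.0 + d / ref) for d in dists]

        for c in range(k):
            members = [i for i in range(len(data)) if labels[i] == c]
            if not members:
                continue  # keep the previous center and weights if a cluster is empty
            total = sum(scales[i] for i in members)

            # 3. Scale-weighted mean, so outliers barely move the center.
            centers[c] = [sum(scales[i] * data[i][j] for i in members) / total
                          for j in range(dim)]

            # 4. Scale-weighted variance per dimension, mapped to weights with an
            #    exponential rule: compact dimensions receive larger weights.
            var = [sum(scales[i] * (data[i][j] - centers[c][j]) ** 2
                       for i in members) / total
                   for j in range(dim)]
            low = min(var)
            raw = [math.exp(-(v - low) / gamma) for v in var]
            weights[c] = [r / sum(raw) for r in raw]

    return labels, centers, weights, scales
```

In this sketch each cluster's weights sum to 1, and a larger `gamma` spreads the weight more evenly across dimensions. The initial centers are passed in explicitly so that the run is deterministic; a practical implementation would need its own initialization strategy.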

References (17)

  • [1] PARSONS L, HAQUE E, LIU H. Subspace clustering for high dimensional data: a review [J]. SIGKDD Explorations, 2004, 6(1): 90-105.
  • [2] AGRAWAL R, GEHRKE J, GUNOPULOS D, et al. Automatic subspace clustering of high dimensional data for data mining applications [C]//Proceedings of ACM SIGMOD International Conference on Management of Data. New York, USA: ACM, 1998: 94-105.
  • [3] AGGARWAL C, PROCOPIUC C, WOLF J L, et al. Fast algorithms for projected clustering [C]//Proceedings of ACM SIGMOD International Conference on Management of Data. New York, USA: ACM, 1999: 61-72.
  • [4] DOMENICONI C, PAPADOPOULOS D, GUNOPULOS D, et al. Subspace clustering of high dimensional data [C]//Proceedings of SIAM International Conference on Data Mining. Philadelphia, PA, USA: SIAM, 2004: 517-521.
  • [5] JING Liping, NG M K, HUANG J Z. An entropy weighting k-means algorithm for subspace clustering of high-dimensional sparse data [J]. IEEE Transactions on Knowledge and Data Engineering, 2007, 19(8): 1026-1041.
  • [6] DOMENICONI C, GUNOPULOS D, MA S, et al. Locally adaptive metrics for clustering high dimensional data [J]. Data Mining and Knowledge Discovery, 2007, 14(1): 63-97.
  • [7] DING C, HE Xiaofeng, ZHA Hongyuan, et al. Adaptive dimension reduction for clustering high dimensional data [C]//Proceedings of IEEE International Conference on Data Mining. Piscataway, NJ, USA: IEEE, 2002: 147-154.
  • [8] DAVE R N, KRISHNAPURAM R. Robust clustering models: a unified view [J]. IEEE Transactions on Fuzzy Systems, 1997, 5(2): 270-293.
  • [9] MU Xiangyang, ZHANG Taiyi, ZHOU Yatong. A robust probabilistic principal component analysis method [J]. Journal of Xi'an Jiaotong University, 2008, 42(10): 1217-1220. (in Chinese) Cited by: 3
  • [10] DING Yuanyuan, DANG Xin, PENG Hanxiang, et al. Robust clustering in high dimensional data using statistical depths [J]. BMC Bioinformatics, 2007, 8(S7): S8.

Second-level references: 16

Co-citing literature: 52

Co-cited literature (45)

  • [1] ZHANG Yun, FENG Boqin, MA Shouqiang, LIU Lianmeng. A text clustering algorithm fusing ant colony and genetic algorithms [J]. Journal of Xi'an Jiaotong University, 2007, 41(10): 1146-1150. (in Chinese) Cited by: 15
  • [2] TENENBAUM J B, SILVA V D, LANGFORD J C. A global geometric framework for nonlinear dimensionality reduction [J]. Science, 2000, 290(5500): 2319-2323.
  • [3] ZHANG Xuanping, ZHU Xingchang, MA Cong. A clustering algorithm based on boundary identification [J]. Journal of Xi'an Jiaotong University, 2007, 41(12): 1387-1390. (in Chinese) Cited by: 5
  • [4] Müller E, Günnemann S, Assent I, et al. Evaluating Clustering in Subspace Projections of High Dimensional Data [J]. Proceedings of the VLDB Endowment, 2009, 2(1): 1270-1281.
  • [5] Assent I, Krieger R, Müller E, et al. INSCY: Indexing Subspace Clusters with In-process-removal of Redundancy [C]//Proceedings of the 8th IEEE International Conference on Data Mining. Washington D.C., USA: IEEE Press, 2008: 719-724.
  • [6] Agrawal R, Gehrke J, Gunopulos D, et al. Automatic Subspace Clustering of High Dimensional Data [J]. Data Mining and Knowledge Discovery, 2005, 11(1): 5-33.
  • [7] Frank A, Asuncion A. UCI Machine Learning Repository [EB/OL]. (2013-11-15). http://archive.ics.uci.edu/ml.
  • [8] Moise G, Sander J, Ester M. P3C: A Robust Projected Clustering Algorithm [C]//Proceedings of the 6th International Conference on Data Mining. Washington D.C., USA: IEEE Press, 2006: 414-425.
  • [9] Sequeira K, Zaki M. SCHISM: A New Approach for Interesting Subspace Mining [C]//Proceedings of the 4th IEEE International Conference on Data Mining. Washington D.C., USA: IEEE Press, 2004: 186-193.
  • [10] LI Jingyun, DU Zhengchun, CHU Guoli, FANG Wanliang. Clustering-based power system stabilizer design under multiple operating modes [J]. Journal of Xi'an Jiaotong University, 2008, 42(2): 204-208. (in Chinese) Cited by: 1

Citing literature: 4

Second-level citing literature: 11
