
Study of a Subspace Clustering Algorithm for High-Dimensional Data Based on Variable Weighting

Cited by: 2
Abstract: The sparsity of high-dimensional data and the curse of dimensionality render most traditional clustering algorithms ineffective, so clustering of high-dimensional data sets has become an active research topic. Subspace clustering is one effective approach to clustering such data. This paper introduces and implements two variable-weighting subspace clustering algorithms for high-dimensional data, SCAD and EWKM, which discover clusters in subspaces spanned by different combinations of dimensions via local weighting of features. Both algorithms are tested on synthetic and real-world data sets, and the results are analyzed to compare their performance and the situations each is suited to.
Source: 《信息化纵横》, 2009, Issue 10, pp. 55-58 (4 pages)
Keywords: high-dimensional data; sparsity; subspace clustering; precision; entropy
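The abstract names EWKM but this record does not reproduce the algorithm. The sketch below shows one common reading of entropy-weighted k-means as described in the Jing, Ng and Huang paper listed in the references: each cluster keeps its own weight per dimension, and a dimension's weight shrinks exponentially with the within-cluster dispersion along it. The function name `ewkm`, the parameters `gamma`, `iters`, `init_centers` and `seed`, and the toy data are all assumptions made for illustration. This is not the implementation evaluated in the paper.

```python
import math
import random


def ewkm(points, k, gamma=1.0, iters=20, init_centers=None, seed=0):
    """Illustrative entropy-weighted k-means; returns (labels, centers, weights)."""
    n_dims = len(points[0])
    if init_centers is None:
        # Assumption: pick k distinct data points as starting centers.
        init_centers = random.Random(seed).sample(points, k)
    centers = [list(c) for c in init_centers]
    # Every cluster starts with uniform weights over the dimensions.
    weights = [[1.0 / n_dims] * n_dims for _ in range(k)]
    labels = [0] * len(points)

    for _ in range(iters):
        # Step 1: assign each point to the cluster with the smallest
        # weighted squared distance, using that cluster's own weights.
        for j, x in enumerate(points):
            costs = [
                sum(w * (z - v) ** 2 for w, z, v in zip(weights[l], centers[l], x))
                for l in range(k)
            ]
            labels[j] = costs.index(min(costs))

        for l in range(k):
            members = [x for x, lab in zip(points, labels) if lab == l]
            if not members:
                continue  # empty cluster: keep its previous center and weights
            # Step 2: the new center is the per-dimension mean of the members.
            centers[l] = [sum(col) / len(members) for col in zip(*members)]
            # Step 3: weight_i is proportional to exp(-dispersion_i / gamma).
            disp = [
                sum((centers[l][i] - x[i]) ** 2 for x in members)
                for i in range(n_dims)
            ]
            shift = min(disp)  # subtracting the minimum avoids underflow in every term
            expo = [math.exp(-(d - shift) / gamma) for d in disp]
            total = sum(expo)
            weights[l] = [e / total for e in expo]
    return labels, centers, weights


# Toy data: dimensions 0 and 1 separate the two groups, dimension 2 is spread out.
group_a = [(0.1 * (i % 2), 0.1 * (i % 3), float(i)) for i in range(6)]
group_b = [(10 + 0.1 * (i % 2), 10 + 0.1 * (i % 3), float(i)) for i in range(6)]
labels, centers, weights = ewkm(
    group_a + group_b, k=2, init_centers=[group_a[0], group_b[0]]
)
print(labels)
print(weights)
```

In this toy run the spread-out third dimension should receive a much smaller weight than the two compact dimensions in both clusters, which is the per-cluster feature weighting the abstract refers to. Larger `gamma` values flatten the weights toward uniform, while smaller values concentrate them on the most compact dimensions.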

References (4)

  • 1 HAN Jiawei, KAMBER Micheline. Data Mining: Concepts and Techniques (数据挖掘概念与技术) [M]. Translated by FAN Ming, MENG Xiaofeng, et al. Beijing: China Machine Press, 2007: 424-479.
  • 2 FRIGUI H, NASRAOUI O. Simultaneous clustering and attribute discrimination [C]. Proceedings of the 9th IEEE International Conference on Fuzzy Systems, 2000.
  • 3 JING L, NG M K, HUANG J Z. An entropy weighting k-means algorithm for subspace clustering of high-dimensional sparse data [J]. IEEE Transactions on Knowledge and Data Engineering, 2007, 19(8): 1-16.
  • 4 Test data sets. http://archive.ics.uci.edu/ml/machine-learning-databases.

Co-citing documents: 42

Co-cited documents: 4

Citing documents: 2

Second-level citing documents: 2
