Abstract
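To address the high time complexity and the sensitivity to input parameters of existing subspace clustering algorithms, a subspace clustering algorithm based on a joint entropy matrix is proposed. The algorithm reduces dimensionality by computing the entropy of the instance distribution of each attribute, computes the joint entropy of every pair of dimensions to form a joint entropy matrix, searches this matrix for the highest-order all-ones submatrix to serve as the interesting subspace, and finally performs clustering in the interesting subspace. Experiments on synthetic and public datasets show that, compared with traditional subspace clustering algorithms, the new algorithm identifies higher-dimensional interesting subspaces at lower cost.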
Existing subspace clustering methods suffer from two problems. First, their running time typically scales exponentially with the data dimensionality or with the subspace dimensionality of the clusters. Second, they are often sensitive to input parameters. To overcome these limitations, a subspace clustering algorithm based on a united entropy matrix (UEM) is presented. The method uses the entropy of each attribute to filter out redundant attributes and uses the UEM to store the united entropy of each pair of attributes. It finds all interesting subspaces by searching the UEM for all-one submatrices. Finally, all subspace clusters are obtained by clustering on the interesting subspaces. Evaluation on both synthetic and real datasets shows that the approach outperforms traditional subspace clustering methods and finds higher-dimensional subspace clusters with better quality.
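The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how attributes are discretized, which direction the entropy filter takes, or how the matrix is binarized, so the equal-width binning, the thresholds `max_entropy` and `max_joint`, and all function names below are hypothetical choices made only for illustration. The "united entropy" of two attributes is computed here as the standard joint entropy H(X, Y). The all-ones submatrix search is done by brute force over index subsets, which is only practical for a small number of attributes, and the final clustering step is omitted.

```python
import math
from collections import Counter
from itertools import combinations


def discretize(column, bins):
    """Map numeric values to equal-width bin indices (binning is an assumption)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0] * len(column)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in column]


def entropy(symbols):
    """Shannon entropy in bits of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def joint_entropy(a, b):
    """Joint entropy H(A, B) in bits, from the co-occurrence of paired symbols."""
    return entropy(list(zip(a, b)))


def binary_joint_entropy_matrix(columns, max_joint):
    """Entry (i, j) is 1 when H(i, j) <= max_joint (hypothetical threshold rule)."""
    d = len(columns)
    m = [[1] * d for _ in range(d)]
    for i, j in combinations(range(d), 2):
        bit = 1 if joint_entropy(columns[i], columns[j]) <= max_joint else 0
        m[i][j] = m[j][i] = bit
    return m


def largest_all_ones_submatrix(m):
    """Largest index set whose principal submatrix is all ones (brute force)."""
    d = len(m)
    for size in range(d, 0, -1):
        for idx in combinations(range(d), size):
            if all(m[i][j] for i, j in combinations(idx, 2)):
                return list(idx)
    return []


def interesting_subspace(data, bins=4, max_entropy=1.9, max_joint=3.0):
    """Return original attribute indices of one candidate interesting subspace."""
    columns = [discretize(list(col), bins) for col in zip(*data)]
    # Assumed filter: drop attributes whose binned values are close to uniform.
    kept = [k for k, col in enumerate(columns) if entropy(col) <= max_entropy]
    m = binary_joint_entropy_matrix([columns[k] for k in kept], max_joint)
    return [kept[i] for i in largest_all_ones_submatrix(m)]
```

Calling `interesting_subspace(rows)` on a list of equal-length numeric rows returns the indices of the attributes that survive the entropy filter and form the largest all-ones block. The default thresholds are placeholders and would need to be chosen per dataset.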
Source
《北京邮电大学学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2014, Issue 3, pp. 104-108 (5 pages)
Journal of Beijing University of Posts and Telecommunications
Funding
National Natural Science Foundation of China (61272515, 61374214)
Beijing Higher Education Young Elite Teacher Project (YETP0453, YETP0474)
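Fundamental Research Funds for the Central Universities (2013RC0501)
National Cloud Computing Demonstration Project
National High Technology Research and Development Program of China (2013AA12A201)
Taiyuan-Zhongguancun Cooperation Special Project (130104)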
Keywords
subspace clustering
united entropy (joint entropy)
interesting subspace