Abstract
This paper proposes a probability density estimation method for data streams. With this method, the information entropy of the whole data set projected onto low-dimensional subspaces can be estimated, and subspaces with good clustering properties can be identified from these entropy estimates. On this basis, a new subspace clustering algorithm for high-dimensional data streams, EPStream, is implemented. Experiments indicate that, compared with traditional algorithms, EPStream improves both clustering accuracy and running time.
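The idea of scoring a low-dimensional projection by its entropy can be illustrated with a minimal sketch. This is not EPStream and not the density estimator from the paper, whose details the abstract does not give. It assumes a plain fixed-width histogram as the density estimate, one-dimensional projections only, and values in a known range; the names `projection_entropy` and `rank_dimensions` are hypothetical.

```python
import math
import random
from collections import Counter


def projection_entropy(points, dim, bins=10, lo=0.0, hi=1.0):
    """Shannon entropy (bits) of a histogram estimate of one dimension.

    Assumption: a fixed-width histogram stands in for the density
    estimator, which the abstract does not specify.
    """
    width = (hi - lo) / bins
    counts = Counter()
    for p in points:
        # Clamp the bin index so out-of-range values land in an edge bin.
        idx = max(0, min(int((p[dim] - lo) / width), bins - 1))
        counts[idx] += 1
    n = len(points)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def rank_dimensions(points, bins=10):
    """Return dimension indices ordered from lowest to highest entropy."""
    dims = range(len(points[0]))
    return sorted(dims, key=lambda d: projection_entropy(points, d, bins))


rng = random.Random(0)
# Dimension 0 holds two tight clusters; dimension 1 is uniform noise.
data = [
    (rng.choice([0.25, 0.75]) + rng.uniform(-0.02, 0.02), rng.random())
    for _ in range(1000)
]
ranking = rank_dimensions(data)
print(ranking)
```

A projection where the points concentrate in a few bins has low entropy, so the clustered dimension is ranked ahead of the noisy one. A streaming version would have to update the bin counts incrementally rather than scan a stored data set.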
Source
《安徽师范大学学报(自然科学版)》
CAS
2015, No. 1, pp. 36-39 (4 pages)
Journal of Anhui Normal University (Natural Science)
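Funding
Anhui Provincial Higher Education Quality Engineering Project, Provincial Characteristic Major (20101284)
Anhui Provincial Higher Education Quality Engineering Project, Teaching Research (20101296)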
Keywords
high-dimensional data streams
clustering algorithm
information entropy