
A Subspace Clustering Algorithm for High-Dimensional Data Based on Hybrid-Grid Partitioning (cited by: 4)
Abstract: A subspace clustering algorithm for high-dimensional data based on hybrid-grid partitioning is proposed. The algorithm eliminates the influence of differing attribute value ranges on the computation, and it removes redundant attributes to improve clustering accuracy and reduce time complexity. Depending on the data distribution, it flexibly chooses between fixed grid partitioning and adaptive grid partitioning, exploiting the strengths of both methods to further reduce time complexity, improve the accuracy of the clustering results, and give the algorithm better scalability. Experiments on synthetic data show that the algorithm is practical and effective for large-scale, high-dimensional data whose attributes have wide value ranges.
Author: Xu Changsen (许倡森)
Source: Computer Technology and Development (《计算机技术与发展》), 2010, No. 10, pp. 150-153 (4 pages)
Keywords: high-dimensional clustering; subspace clustering; relative entropy; grid partitioning
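The abstract names three ingredients: removing the effect of attribute value ranges, filtering redundant attributes, and choosing between fixed and adaptive grids according to the data distribution. The record gives no formulas, so the following is only a minimal sketch of how such steps might fit together, not the paper's method. Everything specific here is an assumption: min-max scaling as the normalization, relative entropy against a uniform distribution as the redundancy score, the thresholds, and all function names.

```python
import math

def min_max_normalize(rows):
    """Rescale each attribute to [0, 1] so wide-ranged attributes do not dominate."""
    out_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        out_cols.append([(v - lo) / span if span > 0 else 0.0 for v in col])
    return [list(r) for r in zip(*out_cols)]

def histogram(values, bins=10):
    """Count normalized values (in [0, 1]) in equal-width bins."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return counts

def relative_entropy_vs_uniform(counts):
    """KL divergence D(p || u) of the bin distribution p from the uniform u."""
    n, k = sum(counts), len(counts)
    return sum((c / n) * math.log((c / n) * k) for c in counts if c > 0)

def select_attributes(norm_rows, bins=10, threshold=0.1):
    """Assumed rule: a near-uniform attribute carries no cluster structure, so drop it."""
    return [j for j, col in enumerate(zip(*norm_rows))
            if relative_entropy_vs_uniform(histogram(col, bins)) > threshold]

def choose_grid(counts, skew_threshold=0.5):
    """Hypothetical switch: adaptive grid for skewed attributes, fixed grid otherwise."""
    return "adaptive" if relative_entropy_vs_uniform(counts) > skew_threshold else "fixed"

def adaptive_edges(counts, tol=0.2):
    """Merge adjacent fine bins with similar counts; return the cut positions as bin indices."""
    edges = [0]
    for i in range(1, len(counts)):
        a, b = counts[i - 1], counts[i]
        if abs(a - b) > tol * max(a, b, 1):
            edges.append(i)
    edges.append(len(counts))
    return edges

# Toy data: attribute 0 has two dense groups over a wide range, attribute 1 is uniform.
rows = [(100.0 + i % 5 if i < 50 else 9000.0 + i % 5, float(i % 10)) for i in range(100)]
norm = min_max_normalize(rows)
kept = select_attributes(norm)
print("kept attributes:", kept)
for j in kept:
    counts = histogram([r[j] for r in norm])
    mode = choose_grid(counts)
    cuts = adaptive_edges(counts) if mode == "adaptive" else list(range(len(counts) + 1))
    print(j, mode, cuts)
```

In this sketch the fixed grid simply keeps every equal-width bin, while the adaptive grid collapses runs of similar bins, which yields fewer cells on skewed attributes. That is one plausible reading of how a hybrid scheme could reduce the number of grid cells to examine.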
