A survey of high-dimensional data visual analysis methods based on subspace clustering (cited by: 8)
Abstract: With the rapid development of information technology and the arrival of the big data era, data increasingly exhibit complex characteristics such as high dimensionality and nonlinearity. For high-dimensional data, it is often difficult to find feature regions that reflect distribution patterns in the full-dimensional space, and most traditional clustering algorithms scale well only to low-dimensional data. As a result, the clusters that traditional algorithms produce on high-dimensional data may not meet current needs. Subspace clustering algorithms instead search for clusters that exist in subspaces of high-dimensional data: they partition the original feature space into different feature subsets, reducing the influence of irrelevant features while preserving the main features of the original data. Subspace clustering can uncover information that is hard to reveal in high-dimensional data, and visualization techniques can then expose the intrinsic structure of data attributes and dimensions; together they provide an effective means for visual analysis of high-dimensional data. This paper summarizes recent research progress on visual analysis methods for high-dimensional data based on subspace clustering, describing three classes of methods: those based on feature selection, on subspace exploration, and on subspace clustering. It also analyzes the associated interactive analysis methods and applications, and discusses future trends in visual analysis of high-dimensional data.
Authors: TIAN Shuai; CHEN Yi (Beijing Key Laboratory of Big Data Technology for Food Safety, School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China)
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2018, No. 13, pp. 19-26 (8 pages)
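Funding: National Science and Technology Support Program of the 12th Five-Year Plan (No. 2012BAD29B01-2); National Science and Technology Basic Work Special Project (No. 2015FY111200); Beijing Municipal Science and Technology Program (No. Z151100001615041); Open Fund of the State Key Laboratory of Virtual Reality Technology and Systems (No. BUAA-VR-17KF-07); 2018 Graduate Research Capability Improvement Program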
Keywords: high-dimensional data; visual analysis; subspace exploration; subspace clustering
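
The abstract's central idea, that cluster structure may exist only in a subset of the dimensions, can be illustrated with a small sketch. This is not a method from the surveyed paper; it is a toy example under stated assumptions. The helper names (`make_data`, `two_cluster_score`, `select_subspace`), the synthetic data, and the scoring rule (the share of a dimension's variance explained by a 1-D two-means split) are all hypothetical choices made for illustration.

```python
import numpy as np


def make_data(n_per_cluster=200, n_noise_dims=8, seed=0):
    """Hypothetical toy data: two clusters separated only in dimensions 0 and 1."""
    rng = np.random.default_rng(seed)
    n = 2 * n_per_cluster
    # Cluster centers at -3 and +3, shared by both informative dimensions.
    centers = np.repeat([-3.0, 3.0], n_per_cluster)[:, None]
    informative = centers + rng.normal(0.0, 0.5, size=(n, 2))
    # Remaining dimensions carry no cluster structure, only wide noise.
    noise = rng.normal(0.0, 3.0, size=(n, n_noise_dims))
    return np.hstack([informative, noise])


def two_cluster_score(x, n_iter=20):
    """Share of variance in one dimension explained by a 1-D two-means split."""
    x = np.asarray(x, dtype=float)
    total = ((x - x.mean()) ** 2).sum()
    if total == 0.0:
        return 0.0  # a constant dimension has no structure to explain
    c_lo, c_hi = x.min(), x.max()
    for _ in range(n_iter):
        low = np.abs(x - c_lo) <= np.abs(x - c_hi)  # assign to nearer center
        c_lo, c_hi = x[low].mean(), x[~low].mean()  # update both centers
    within = ((x[low] - c_lo) ** 2).sum() + ((x[~low] - c_hi) ** 2).sum()
    return 1.0 - within / total


def select_subspace(X, k):
    """Return the k highest-scoring dimensions and the score of every dimension."""
    scores = np.array([two_cluster_score(X[:, j]) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k], scores


X = make_data()
dims, scores = select_subspace(X, k=2)
print("selected dimensions:", sorted(dims.tolist()))
print("scores per dimension:", np.round(scores, 2))
```

Scoring each dimension separately keeps the sketch short, but it only finds axis-aligned subspaces and would miss clusters that appear only when several dimensions are combined; the subspace clustering algorithms the survey covers address that more general case.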
