High-Dimensional Outlier Detection Algorithm Based on Correlation Subspace (cited by: 3)
Abstract: To improve the accuracy and efficiency of outlier detection, an outlier detection algorithm based on correlation subspaces is proposed. First, a sparsity matrix is derived from the local density distribution of the data, and the sparsity features are amplified with a Gaussian similarity kernel function. Next, the sparsity similarity factor of the data in each attribute dimension is computed to determine the subspace vector and the correlation subspace; the outlier factor of each data object is then obtained by combining data sparsity with dimension weights, and the objects with the largest outlier factors are selected as outliers. Finally, experiments on synthetic data sets and UCI data sets verify the accuracy and effectiveness of the algorithm.
Authors: ZHAO Xiang-bing, ZHANG Tian-gang (School of Computer and Network Engineering, Shanxi Datong University, Datong, Shanxi 037009, China)
Source: Computing Technology and Automation (《计算技术与自动化》), 2022, No. 1, pp. 82-86 (5 pages)
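Funding: Shanxi Province Education Science "13th Five-Year Plan" Project (GH-18044); Shanxi Datong University Scientific Research Fund Project (2017K11); Shanxi Datong University Teaching Reform and Innovation Project (XJG2020211).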
Keywords: data mining; outlier data; sparsity; Gaussian kernel function; similarity factor; correlation subspace; simulation experiment; algorithm analysis
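The abstract describes the pipeline only in outline (sparsity matrix, Gaussian kernel amplification, per-dimension similarity factor, subspace vector, weighted outlier factor, top-ranked objects). The Python sketch below fills each step with a simple stand-in so the flow can be followed end to end. Every formula in it is an assumption made for illustration, as are the function names (`knn_sparsity`, `outlier_factors`, `top_outliers`) and the parameters (`k`, `sigma`, `tau`, `eps`); none of them are taken from the paper.

```python
import math
from statistics import median


def knn_sparsity(X, k):
    """S[i][j]: mean distance from object i to its k nearest neighbours along attribute j.

    Assumed stand-in for the paper's local-density-based sparsity matrix.
    """
    n, d = len(X), len(X[0])
    S = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in X]
        for i in range(n):
            gaps = sorted(abs(col[i] - v) for t, v in enumerate(col) if t != i)
            S[i][j] = sum(gaps[:k]) / k
    return S


def outlier_factors(X, k=5, sigma=1.0, tau=0.5, eps=1e-12):
    """Return one score per object; a larger score means more outlying.

    All scoring choices here are illustrative assumptions, not the authors' definitions.
    """
    S = knn_sparsity(X, k)
    n, d = len(S), len(S[0])
    med = [median(S[i][j] for i in range(n)) for j in range(d)]

    # Relative sparsity: how many times sparser than the typical object in that dimension.
    rel = [[S[i][j] / (med[j] + eps) for j in range(d)] for i in range(n)]

    # Gaussian similarity to the typical sparsity of the dimension. Only values that are
    # sparser than typical lose similarity, so a small excess becomes a sharp drop.
    sim = [
        [math.exp(-max(r - 1.0, 0.0) ** 2 / (2.0 * sigma ** 2)) for r in row]
        for row in rel
    ]

    # Dimension weight (assumed): mean similarity, so dimensions whose sparsity is
    # erratic for many objects contribute less.
    w = [sum(sim[i][j] for i in range(n)) / n for j in range(d)]

    # Subspace vector of object i (assumed): the dimensions whose similarity is below tau.
    # The outlier factor sums weighted relative sparsity over that subspace only.
    return [
        sum(w[j] * rel[i][j] for j in range(d) if sim[i][j] < tau)
        for i in range(n)
    ]


def top_outliers(X, m, **params):
    """Indices of the m objects with the largest outlier factors."""
    scores = outlier_factors(X, **params)
    return sorted(range(len(X)), key=lambda i: scores[i], reverse=True)[:m]
```

Restricting the sum to each object's own low-similarity dimensions is what keeps the score from being diluted by attributes in which the object looks ordinary, which is the usual motivation for subspace methods on high-dimensional data. The per-attribute nearest-neighbour search is quadratic in the number of objects, so this version is suitable only for small data sets.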