
Outlier Detection Based on Kernel Function-Principal Component Dimension Reduction
Cited by: 1
Abstract: To improve the efficiency of outlier mining on high-dimensional data sets, this paper analyzes traditional outlier mining algorithms and proposes a new outlier detection algorithm. The algorithm transforms the nonlinear problem into a linear problem in a high-dimensional feature space and reduces the dimensionality using a kernel function combined with principal component analysis. It then scans the projected components of each data object one by one to decide whether that data point is an outlier. The method applies to outliers in both linearly separable and linearly non-separable data sets. Experimental results indicate the superiority of the algorithm.
Source: Computer Engineering (《计算机工程》), 2008, No. 8, pp. 82-84 (3 pages). Indexed in: CAS, CSCD, Peking University Core Journals (北大核心).
Keywords: dimension reduction; kernel function; principal component
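
The procedure outlined in the abstract (map the data through a kernel, keep a few principal components, then scan each projected component for extreme values) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: this record does not give the paper's kernel, parameters, or outlier criterion, so the RBF kernel, the values of `gamma`, `n_components`, and `z_threshold`, the z-score rule, the toy data, and all function names are hypothetical choices made for the example.

```python
import math
import random


def rbf_gram(points, gamma):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2) (assumed kernel)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
             for q in points] for p in points]


def center_gram(K):
    """Center the Gram matrix so the mapped data has zero mean in feature space."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]   # equals the column means (K is symmetric)
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def top_eigenpairs(M, k, iters=300, seed=0):
    """Leading k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    rng = random.Random(seed)
    n = len(M)
    M = [row[:] for row in M]                # work on a copy
    pairs = []
    for _ in range(k):
        v = [rng.random() - 0.5 for _ in range(n)]
        lam = 0.0
        for _ in range(iters):
            w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:                 # nothing left to extract
                lam = 0.0
                break
            v = [x / norm for x in w]
            lam = norm
        pairs.append((lam, v))
        for i in range(n):                   # deflate: remove the component just found
            for j in range(n):
                M[i][j] -= lam * v[i] * v[j]
    return pairs


def detect_outliers(points, gamma=0.1, n_components=2, z_threshold=3.0):
    """Indices of points whose projection on any retained kernel principal
    component lies more than z_threshold standard deviations from the mean
    (an assumed criterion, chosen only for this sketch)."""
    Kc = center_gram(rbf_gram(points, gamma))
    flagged = set()
    for lam, v in top_eigenpairs(Kc, n_components):
        # Projection of training point i on this component is sqrt(lam) * v[i].
        proj = [math.sqrt(max(lam, 0.0)) * x for x in v]
        mean = sum(proj) / len(proj)
        std = math.sqrt(sum((p - mean) ** 2 for p in proj) / len(proj))
        if std < 1e-12:
            continue
        flagged.update(i for i, p in enumerate(proj)
                       if abs(p - mean) / std > z_threshold)
    return sorted(flagged)


# Toy data (made up for illustration): a tight 5 x 4 grid plus one distant point.
points = [(0.2 * i, 0.2 * j) for i in range(5) for j in range(4)] + [(5.0, 5.0)]
flagged = detect_outliers(points)
print(flagged)
```

The sketch uses only the standard library so that it stays self-contained. For real data sets, a numerical library's eigensolver would be the more practical choice than power iteration.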

References (6)

  • 1. Beyer K, Goldstein J, Ramakrishnan R, et al. When is Nearest Neighbor Meaningful?[C]//Proceedings of the 7th International Conference on Database Theory. [S. l.]: Springer, 1999: 217-235.
  • 2. Li Yajun. Reforming the Theory of Invariant Moments for Pattern Recognition[J]. Pattern Recognition, 1992, 25(7): 723-730.
  • 3. Giudici P. Applied Data Mining: Statistical Methods for Business and Industry[M]. Beijing, China: Electronics Industry Press, 2004.
  • 4. Suykens J A K, Van Gestel T, Vandewalle J, et al. A Support Vector Machine Formulation to PCA Analysis and Its Kernel Version[R]. Leuven, Belgium: Katholieke Universiteit Leuven, Technical Report: 200268-2002, 2002.
  • 5. Scholkopf B, Smola A, Muller K R. Nonlinear Component Analysis as a Kernel Eigenvalue Problem[J]. Neural Computation, 1998, 10(5): 1299-1319.
  • 6. The Third International Knowledge Discovery and Data Mining Tools Competition Dataset[Z] (1999-02-01). http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html

Co-cited documents (7)

  • 1. Yang Xinyu, Zeng Ming, Zhao Rui, Wu Hang. A Survey of Applications of Fractal Theory in Network Traffic Analysis[J]. Computer Engineering, 2004, 30(23): 17-18. Cited by: 2
  • 2. Ramaswamy S, Rastogi R, Shim K. Efficient Algorithms for Mining Outliers from Large Data Sets[C]//Proc. of 2000 ACM SIGMOD International Conference on Management of Data. Dallas, Texas, USA: [s. n.], 2000: 93-104.
  • 3. Angiulli F, Pizzuti C. Outlier Mining in Large High-dimensional Data Sets[J]. IEEE Trans. on Knowledge and Data Engineering, 2005, 17(2): 203-215.
  • 4. He Zengyou, Deng Shengchun, Xu Xiaofei. An Optimization Model for Outlier Detection in Categorical Data[C]//Proc. of 2005 International Conference on Intelligent Computing. Hefei, China: [s. n.], 2005: 400-409.
  • 5. Aggarwal C C, Yu P S. Outlier Detection for High-dimensional Data[C]//Proc. of 2001 ACM SIGMOD International Conference on Management of Data. Santa Barbara, USA: [s. n.], 2001: 37-46.
  • 6. Cristofor D, Simovici D. Finding Median Partitions Using Information Theoretical-based Genetic Algorithms[J]. Journal of Universal Computer Science, 2002, 8(2): 153-172.
  • 7. Sun Yinan, Liu Weijun, Wang Yuechao. Image Edge Detection Method Based on Fractal Theory and Mathematical Morphology[J]. Computer Engineering, 2003, 29(20): 20-21. Cited by: 7

Citing documents (1)

Second-level citing documents (5)
