Outlier Detection Based on Kernel Function-Principal Component Dimension Reduction (cited by: 1)

Outliers Detection Based on Kernel Function-Principle Component Dimension Reduction

Abstract: To improve the efficiency of outlier mining on high-dimensional data sets, this paper analyzes traditional outlier mining algorithms and proposes a new outlier detection algorithm. The algorithm transforms a nonlinear problem into a linear problem in a high-dimensional feature space and reduces the data dimension using a kernel function combined with principal components. It then scans the projection components of each data object one by one to decide whether that data point is an outlier. The method applies to outliers in both linearly separable and linearly inseparable data sets. Experimental results indicate that the algorithm outperforms the traditional algorithms considered.
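The pipeline summarized in the abstract (kernel mapping, principal-component projection, then a per-point check of the projected values) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the kernel, the number of components, or the decision rule. The RBF kernel, the use of only the first kernel principal component, the z-score threshold, and all names (`rbf_kernel_matrix`, `center_kernel`, `top_eigenpair`, `kpca_outliers`, `gamma`, `z_threshold`) are assumptions chosen for illustration.

```python
import math
import statistics


def rbf_kernel_matrix(points, gamma):
    """Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2) (assumed kernel)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
             for q in points] for p in points]


def center_kernel(K):
    """Center the Gram matrix so the mapped data has zero mean in feature space."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]  # K is symmetric: column means match
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def top_eigenpair(M, iters=200):
    """Leading eigenpair of a symmetric PSD matrix by power iteration.

    Assumes M is not the zero matrix (i.e. the points are not all identical).
    """
    n = len(M)
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(M[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v


def kpca_outliers(points, gamma=0.5, z_threshold=2.0):
    """Return indices whose first kernel-PC projection is far from the mean.

    The z-score rule is an assumed decision rule, not taken from the paper.
    """
    Kc = center_kernel(rbf_kernel_matrix(points, gamma))
    lam, v = top_eigenpair(Kc)
    # Projection of training point i onto the first component: sqrt(lam) * v[i].
    scores = [math.sqrt(max(lam, 0.0)) * x for x in v]
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)
    return [i for i, s in enumerate(scores) if abs(s - mean) > z_threshold * std]


# Toy data: a tight cluster near the origin plus one distant point.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
        (0.2, 0.1), (0.1, 0.2), (0.2, 0.2), (0.0, 0.2),
        (5.0, 5.0)]
print(kpca_outliers(data))
```

On this toy data the distant point (index 8) dominates the first component, so it should be the only index flagged. A fuller version would examine several components in turn, as the abstract's "one by one" scan suggests, and would normally use a linear-algebra library instead of hand-written power iteration.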

Authors: Xu Xuesong (徐雪松), Liu Yaozong (刘耀宗), Zhao Xuelong (赵学龙), Zhang Hong (张宏), Liu Fengyu (刘凤玉)

Affiliation: School of Computer Science and Technology, Nanjing University of Science and Technology

Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2008, No. 8, pp. 82-84 (3 pages)

Keywords: dimension reduction; kernel function; principal component

Classification number: TP311.5 [Automation and Computer Technology: Computer Software and Theory]

Citation network
Related literature

References (6)

1. Beyer K, Goldstein J, Ramakrishnan R, et al. When Is Nearest Neighbor Meaningful?[C]//Proceedings of the 7th International Conference on Database Theory. [S. l.]: Springer, 1999: 217-235.
2. Li Yajun. Reforming the Theory of Invariant Moments for Pattern Recognition[J]. Pattern Recognition, 1992, 25(7): 723-730.
3. Giudici P. Applied Data Mining: Statistical Methods for Business and Industry[M]. Beijing, China: Electronics Industry Press, 2004.
4. Suykens J A K, Van Gestel T, Vandewalle J, et al. A Support Vector Machine Formulation to PCA Analysis and Its Kernel Version[R]. Leuven, Belgium: Katholieke Universiteit Leuven, Technical Report: 200268-2002, 2002.
5. Schölkopf B, Smola A, Müller K R. Nonlinear Component Analysis as a Kernel Eigenvalue Problem[J]. Neural Computation, 1998, 10(5): 1299-1319.
6. The Third International Knowledge Discovery and Data Mining Tools Competition Dataset[Z]. (1999-02-01). http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html

Co-cited literature (7)

1. Yang Xinyu, Zeng Ming, Zhao Rui, Wu Hang. A Survey of Applications of Fractal Theory in Network Traffic Analysis[J]. Computer Engineering, 2004, 30(23): 17-18. Cited by: 2
2. Ramaswamy S, Rastogi R, Shim K. Efficient Algorithms for Mining Outliers from Large Data Sets[C]//Proc. of 2000 ACM SIGMOD International Conference on Management of Data. Dallas, Texas, USA: [s.n.], 2000: 93-104.
3. Angiulli F, Pizzuti C. Outlier Mining in Large High-dimensional Data Sets[J]. IEEE Trans. on Knowledge and Data Engineering, 2005, 17(2): 203-215.
4. He Zengyou, Deng Shengchun, Xu Xiaofei. An Optimization Model for Outlier Detection in Categorical Data[C]//Proc. of 2005 International Conference on Intelligent Computing. Hefei, China: [s.n.], 2005: 400-409.
5. Aggarwal C C, Yu P S. Outlier Detection for High-dimensional Data[C]//Proc. of 2001 ACM SIGMOD International Conference on Management of Data. Santa Barbara, USA: [s.n.], 2001: 37-46.
6. Cristofor D, Simovici D. Finding Median Partitions Using Information Theoretical-based Genetic Algorithms[J]. Journal of Universal Computer Science, 2002, 8(2): 153-172.
7. Sun Yinan, Liu Weijun, Wang Yuechao. An Image Edge Detection Method Based on Fractal Theory and Mathematical Morphology[J]. Computer Engineering, 2003, 29(20): 20-21. Cited by: 7

Citing literature (1)

1. Sun Jinhua, Hu Jian, Li Xiangyang. Outlier Detection Based on Fractal Theory[J]. Computer Engineering, 2011, 37(3): 33-35. Cited by: 5

Second-level citing literature (5)

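1. Liu Xiangxin. Entropy-Distance Outlier Detection and Its Application in Student Evaluation of Teaching[J]. Journal of Hubei University of Education, 2012, 29(2): 84-86.
2. Sun Aicheng. Outlier Detection Based on Entropy Distance and Its Application[J]. Radio Engineering, 2012, 42(6): 45-47. Cited by: 3
3. Li Hong'an, Kang Baosheng, Zhang Jing, Tong Jianfeng. A Cloud-Computing-Oriented Multivariate Local Prediction Model Based on Principal Component Analysis[J]. Application Research of Computers, 2012, 29(11): 4170-4175. Cited by: 1
4. Tang Qi, Liu Xuejun. Research on Outlier Time Series Detection in Wireless Sensor Networks[J]. Chinese Journal of Sensors and Actuators, 2013, 26(1): 95-99. Cited by: 3
5. Li Junli, Lu Cailin. Research on Outlier Detection Algorithms[J]. Computer and Digital Engineering, 2017, 45(6): 1045-1048. Cited by: 2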

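1. Xu Xuesong, Zhang Xu, Song Dongming, Zhang Hong, Liu Fengyu. An Outlier Detection Algorithm Based on Nonlinear Data Transformation[J]. Engineering Science, 2008, 10(9): 74-78. Cited by: 3
2. Xu Xuesong. Nonlinear Data Transformation and Its Application in Outlier Clustering[J]. Software Guide, 2009, 8(10): 6-9.
3. Wu Xinling, Wu Guoqing. A Dimension Reduction Method Based on Data Transformation[J]. Journal of Wuhan University (Natural Science Edition), 2006, 52(1): 73-76. Cited by: 4
4. Wu Xinling. Research on Data Dimension Reduction Methods[J]. Computer Engineering and Design, 2006, 27(16): 3000-3002. Cited by: 2
5. Wang Jing, Zhou Kuang. Tumor Gene Identification Based on Support Vector Machines[J]. Computer and Digital Engineering, 2011, 39(9): 3-6. Cited by: 4
6. Xu Xuesong, Zhang Hong, Liu Fengyu. LLE Dimension Reduction Based on a Kernel-Function Distance Measure and Its Application in Outlier Clustering[J]. Chinese Journal of Scientific Instrument, 2008, 29(9): 1996-2000. Cited by: 5
7. Xu Xuesong, Zhang Wei, Song Dongming, Zhang Hong, Liu Fengyu. Kernel-Based PP Principal Component Analysis and Its Application in Outlier Clustering[J]. Computer Science, 2007, 34(9): 131-134. Cited by: 1
8. Ren Lei, Yang Zhonggen. An Ellipse Fitting Algorithm Based on Subspace Projection[J]. Journal of Shanghai Maritime University, 2006, 27(3): 90-94. Cited by: 3
9. Tu Honglei. Implementation of Two Algorithms for Recognizing Hand-Drawn Shapes[J]. Computer Engineering and Design, 2009, 30(10): 2513-2515.
10. Zhao Song, Zhang Zhijian, Zhang Peiren. Enhanced Canonical Correlation Analysis and Its Application to Feature Fusion in Face Recognition[J]. Journal of Computer-Aided Design and Computer Graphics, 2009, 21(3): 394-399. Cited by: 16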
