Abstract
To improve the efficiency of outlier mining on high-dimensional data sets, this paper analyzes traditional outlier mining algorithms and proposes an outlier detection algorithm. The algorithm converts a nonlinear problem into a linear problem in a high-dimensional feature space, uses a kernel function together with principal component analysis to reduce the dimensionality, and scans the projected components of each data object one by one to decide whether that data point is an outlier. It applies to outlier detection in both linearly separable and linearly inseparable data sets. Experimental results indicate the superiority of the algorithm.
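The pipeline described in the abstract (kernel mapping, principal-component reduction, then a per-component scan of each object's projection) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's algorithm: the abstract does not name the kernel, the number of retained components, or the decision rule, so the RBF kernel, the `gamma` and `n_components` values, the median/MAD robust score, and the `threshold` of 3 are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def rbf_kernel_matrix(X, gamma):
    """Pairwise RBF kernel k(x, y) = exp(-gamma * ||x - y||^2). (Assumed kernel.)"""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kernel_pca_projections(X, n_components=3, gamma=0.1):
    """Project the rows of X onto the leading kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel_matrix(X, gamma)
    # Center the kernel matrix, i.e. center the data in feature space.
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[order], 1e-12)
    # Projection of training point i on component k is sqrt(lam_k) * v_k[i].
    return eigvecs[:, order] * np.sqrt(lam)


def flag_outliers(X, n_components=3, gamma=0.1, threshold=3.0):
    """Scan each projected component; flag points with a large robust score.

    The median/MAD score and the threshold are assumptions made for this
    sketch; the abstract does not state the paper's decision criterion.
    """
    Z = kernel_pca_projections(X, n_components, gamma)
    median = np.median(Z, axis=0)
    mad = np.median(np.abs(Z - median), axis=0)
    mad = np.where(mad > 0, mad, 1e-12)  # guard against zero spread
    scores = np.abs(Z - median) / (1.4826 * mad)
    # A point is flagged if any single component marks it as extreme.
    return np.any(scores > threshold, axis=1)


if __name__ == "__main__":
    # Toy data: a 7x7 grid of inliers plus one distant point.
    g = np.linspace(-0.5, 0.5, 7)
    inliers = np.array([[a, b] for a in g for b in g])
    X = np.vstack([inliers, [[6.0, 6.0]]])
    print(np.flatnonzero(flag_outliers(X)))
```

Because the scan runs over a handful of projected components rather than the original attributes, the per-point cost of the decision step depends on `n_components`, not on the input dimension. The kernel matrix itself is still n by n, so this sketch is only practical for modest n.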
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals (北大核心)
2008, No. 8, pp. 82-84 (3 pages)
Computer Engineering
Keywords
dimension reduction
kernel function
principal component