Abstract
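An outlier is a data object that does not share the general characteristics of the data. A support vector machine (SVM) maps data points into a high-dimensional feature space and separates outliers from normal points with a maximum-margin hyperplane. Exploiting the strengths of SVMs on small samples and high-dimensional data, together with their strong generalization performance, this paper proposes a new detection algorithm based on the Gaussian process latent variable model (GPLVM) and support vector classification. The GPLVM supplies a smooth probabilistic mapping from latent variables to the data space, which is used for dimensionality reduction; outliers are then detected with a cross-validated SVM. Simulation experiments on the KDD99 dataset show that the algorithm effectively raises the detection rate while keeping the false-alarm rate low, which demonstrates the effectiveness of the method.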
Outliers are objects that do not comply with the general behavior of the data. A support vector machine (SVM) finds the maximal-margin hyperplane in feature space in order to distinguish outliers from normal samples. Based on the high performance of SVMs in tackling small sample sizes and high dimensionality, and on their good generalization, we propose a new method for outlier detection that combines a novel unsupervised algorithm, GPLVM (Gaussian process latent variable model), with a standard SVM. GPLVM provides a smooth probabilistic mapping from latent to data space and embeds the dataset in a low-dimensional space, which is used for cross-validation of the SVM. The proposed approach was applied to KDD99 benchmark problems, and the simulation results show its validity.
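The abstract judges the method by its detection rate and false-alarm rate. As a minimal sketch, assuming the usual intrusion-detection definitions (detection rate = detected outliers / all true outliers; false-alarm rate = normal points flagged as outliers / all normal points), the two quantities can be computed as follows. The function name `detection_metrics` and the toy labels are hypothetical and are not taken from the paper, whose exact definitions may differ.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection_rate, false_alarm_rate).

    Assumed label convention (not from the paper): 1 = outlier, 0 = normal.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # outliers caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # outliers missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal points flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal points passed

    # Guard against empty classes so the ratios are always defined.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate


# Toy labels for illustration only; these are not KDD99 results.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
dr, far = detection_metrics(y_true, y_pred)
print(f"detection rate = {dr:.3f}, false-alarm rate = {far:.3f}")
```

In a pipeline like the one described, `y_pred` would come from the SVM evaluated on the held-out fold of each cross-validation split, so both rates can be averaged across folds.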
Source
《计算机科学》
CSCD
Peking University Core Journal (北大核心)
2010, Issue 3, pp. 245-247 (3 pages)
Computer Science
Funding
Supported by the National Natural Science Foundation of China (60605022)