
A Hybrid Kernel Method for Outlier Mining

Hybrid Method for Outliers Detection Using GPLVM and SVM
Abstract: Outliers are data objects that do not comply with the general behavior of the data. A support vector machine (SVM) maps data points into a high-dimensional feature space and separates outliers from normal samples with a maximum-margin hyperplane. Building on the strengths of SVMs in handling small sample sizes and high-dimensional data, and on their good generalization performance, we propose a new outlier detection method that combines the unsupervised Gaussian process latent variable model (GPLVM) with a standard support vector classifier. GPLVM provides a smooth probabilistic mapping from latent space to data space and is used to embed the dataset in a low-dimensional space; outliers are then detected by an SVM evaluated with cross-validation on the embedded data. Simulation experiments on the KDD99 benchmark dataset show that the algorithm effectively improves the detection rate while keeping the false alarm rate low, which supports the validity of the method.
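Authors: TIAN Jiang (田江), GU Hong (顾宏)
Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2010, No. 3, pp. 245-247 (3 pages)
Funding: Supported by the National Natural Science Foundation of China (60605022)
Keywords: Outlier detection, Support vector machine, Dimensionality reduction, Gaussian process latent variable model (GPLVM)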
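
The two-stage pipeline described in the abstract (reduce dimensionality, then run a cross-validated SVM and report detection and false alarm rates) can be illustrated with a minimal sketch. This is not the authors' implementation. It rests on several assumptions: scikit-learn's `PCA` is swapped in as a linear stand-in for GPLVM (scikit-learn ships no GPLVM), the data are synthetic rather than KDD99, and the function names, latent dimension, fold count, and SVM hyperparameters are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_toy_data(n_normal=300, n_outlier=30, n_features=20, seed=0):
    """Synthetic stand-in for KDD99: a normal cluster plus a shifted outlier cluster."""
    rng = np.random.default_rng(seed)
    normal = rng.normal(0.0, 1.0, size=(n_normal, n_features))
    outliers = rng.normal(4.0, 1.0, size=(n_outlier, n_features))
    X = np.vstack([normal, outliers])
    # Label 0 = normal point, label 1 = outlier.
    y = np.concatenate([np.zeros(n_normal, dtype=int), np.ones(n_outlier, dtype=int)])
    return X, y


def detect_outliers(X, y, latent_dim=2, n_folds=5):
    """Reduce dimension, then classify with an RBF SVM under k-fold cross-validation.

    Returns (detection_rate, false_alarm_rate) from out-of-fold predictions.
    """
    model = make_pipeline(
        StandardScaler(),
        PCA(n_components=latent_dim),  # assumption: linear stand-in for GPLVM
        SVC(kernel="rbf", C=1.0, gamma="scale"),  # hypothetical hyperparameters
    )
    # Each sample is predicted by a model that never saw it during fitting.
    y_pred = cross_val_predict(model, X, y, cv=n_folds)
    detection_rate = float(np.mean(y_pred[y == 1] == 1))   # outliers flagged as outliers
    false_alarm_rate = float(np.mean(y_pred[y == 0] == 1))  # normal points flagged as outliers
    return detection_rate, false_alarm_rate


X, y = make_toy_data()
detection_rate, false_alarm_rate = detect_outliers(X, y)
print(f"detection rate: {detection_rate:.3f}, false alarm rate: {false_alarm_rate:.3f}")
```

In this sketch the projection is refit inside each fold, which keeps the held-out samples out of the embedding step. The abstract does not say whether the GPLVM embedding was learned on the full dataset or per fold, so that design choice is also an assumption.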
