
Outlier Detection Algorithm Based on Dispersion of Neighbors
Cited by: 21
Abstract Outlier detection plays an important role in machine learning and data mining. A major limitation of existing outlier detection algorithms is that normal data points near the border of the data receive high outlier scores, so they are misclassified as outliers in some situations. To address this problem, this paper proposes a new outlier detection algorithm based on the dispersion of neighbors. The algorithm uses the dispersion of the neighborhood around a data point as that point's outlier degree, which keeps the outlier degree of border points from becoming too high while still separating normal points from outliers well. Experimental results show that the algorithm detects outliers effectively, is insensitive to parameter selection, and performs stably.
Authors SHEN Yanhui (沈琰辉); LIU Huawen (刘华文); XU Xiaodan (徐晓丹); ZHAO Jianmin (赵建民); CHEN Zhongyu (陈中育) (College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua, Zhejiang 321004, China)
Source Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 12, pp. 1763-1772 (10 pages)
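Funding National Natural Science Foundation of China, Nos. 61272007, 61272468, 61572443; Zhejiang Provincial Natural Science Foundation, No. LY14F020012; Zhejiang Provincial Department of Education project, No. Y201328291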
Keywords outlier detection; machine learning; data mining; principal component analysis
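
The abstract states the idea (score each point by the dispersion of its neighborhood) but not the formula, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that the neighborhood is the point plus its k nearest neighbors under Euclidean distance, and that "dispersion" is the total variance of that neighborhood (the trace of its covariance matrix, which equals the sum of its principal component variances). The function name `neighborhood_dispersion_scores` and the parameter `k` are hypothetical.

```python
import numpy as np


def neighborhood_dispersion_scores(X, k=5):
    """Score each row of X by the total variance of its k-NN neighborhood.

    Assumption (not taken from the paper): "dispersion" is the trace of the
    covariance matrix of the point together with its k nearest neighbors.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    # Pairwise squared Euclidean distances, shape (n, n).
    diffs = X[:, None, :] - X[None, :, :]
    sq_dists = np.einsum("ijk,ijk->ij", diffs, diffs)

    # A point must not be listed as its own neighbor.
    np.fill_diagonal(sq_dists, np.inf)
    neighbor_idx = np.argsort(sq_dists, axis=1)[:, :k]

    scores = np.empty(n)
    for i in range(n):
        # Neighborhood = the point itself plus its k nearest neighbors.
        neighborhood = np.vstack([X[i], X[neighbor_idx[i]]])
        centered = neighborhood - neighborhood.mean(axis=0)
        # Mean squared distance to the centroid = trace of the covariance.
        scores[i] = np.sum(centered ** 2) / neighborhood.shape[0]
    return scores


if __name__ == "__main__":
    # Toy data: a 5 x 5 grid of normal points plus one far-away point.
    grid = np.array([[x, y] for x in range(5) for y in range(5)], dtype=float)
    data = np.vstack([grid, [[20.0, 20.0]]])
    scores = neighborhood_dispersion_scores(data, k=5)
    print("index with the highest score:", int(np.argmax(scores)))
```

In this sketch an isolated point is grouped with neighbors that lie far from it, so its neighborhood is widely spread and its score is large, while a point on the edge of a cluster still has compact neighbors and keeps a low score. That is the qualitative behavior the abstract describes; the paper's actual dispersion measure may differ.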