
Improved outlier detection algorithm based on difference between nearest neighbors distance (cited by: 10)
Abstract: The results of the k-nearest-neighbor outlier detection algorithm depend heavily on user-set parameters, and the algorithm cannot determine how strong an outlier is. To address this defect, a threshold radius and a density threshold are introduced, and an outlier detection algorithm based on the difference between nearest-neighbor distances is proposed. Experiments on several data sets show that the improved algorithm widens the usable range of parameter settings, reduces the influence of the parameters on the results, and effectively detects strong outliers. By adjusting the density threshold, users can determine the strength of the outliers. The improved algorithm enhances the stability and flexibility of the original algorithm.
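The abstract does not spell out the improved algorithm, so the following is only a minimal sketch of the baseline it starts from, assuming one common formulation of k-nearest-neighbor outlier detection: score each point by the distance to its k-th nearest neighbor and report the n highest-scoring points. The function names, the parameters `k` and `n`, and the toy data are hypothetical illustrations, not taken from the paper. The threshold radius and density threshold are not modeled because their definitions are not given in this record.

```python
import math


def knn_distance_scores(points, k):
    """Return, for each point, the distance to its k-th nearest neighbor."""
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def top_n_outliers(points, k, n):
    """Return indices of the n points with the largest k-th-neighbor distance."""
    scores = knn_distance_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Hypothetical toy data: a tight cluster plus one distant point (index 4).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(top_n_outliers(points, k=2, n=1))
```

In this sketch both `k` and `n` must be chosen by the user, and the output is a ranking with no notion of outlier strength. That is the kind of parameter sensitivity the abstract describes.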
Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal, 2013, No. 4, pp. 1265-1269 (5 pages)
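Funding: National Natural Science Foundation of China (60970059, 61170136); Shanxi Province Natural Science Foundation (2011011015-4); Shanxi Province Youth Foundation (2011021013-3); Taiyuan University of Technology Youth Foundation (K201021)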
Keywords: outlier detection; difference between nearest neighbors distance; parameter settings; k nearest neighbor; strong outliers
