Research of Distance-based Outliers Detection (基于距离的孤立点检测研究)

Cited by: 44
Abstract: Outlier detection is an important task in knowledge discovery. After analyzing distance-based outliers and the algorithms for detecting them, this paper proposes a new definition for judging outliers and develops a sampling-based approximate detection algorithm. Experiments were carried out on real data. The results indicate that the new definition yields the same results as the DB(p,d) outlier definition while simplifying what outlier detection requires of the user. It also gives each data object's degree of outlyingness within the dataset.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2004, Issue 33, pp. 73-75, 94 (4 pages)
Keywords: outlier detection, outlier, data mining, sampling
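The abstract compares against the DB(p,d) definition, under which an object is an outlier if at least a fraction p of the objects in the dataset lie farther than distance d from it. The following is a minimal sketch of that test, plus a generic sampled variant that checks each object against a random sample rather than the whole dataset. It assumes Euclidean distance on numeric tuples, and the function names are hypothetical. The abstract does not spell out the paper's new definition or its sampling algorithm, so the sampled variant here only illustrates the general idea and is not the authors' method.

```python
import math
import random


def far_fraction(point, reference, d):
    """Fraction of objects in `reference` lying farther than d from `point`."""
    far = sum(1 for q in reference if math.dist(point, q) > d)
    return far / len(reference)


def db_outliers(data, p, d):
    """Indices of DB(p, d) outliers, found by exhaustive comparison."""
    return [i for i, x in enumerate(data) if far_fraction(x, data, d) >= p]


def db_outliers_sampled(data, p, d, sample_size, seed=0):
    """Same test, but each object is compared with a random sample only.

    This is an approximation: the fraction is estimated from the sample,
    so results can differ from the exhaustive version on small samples.
    """
    rng = random.Random(seed)
    k = min(sample_size, len(data))
    result = []
    for i, x in enumerate(data):
        sample = rng.sample(data, k)
        if far_fraction(x, sample, d) >= p:
            result.append(i)
    return result


# Four clustered points and one distant point.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(db_outliers(data, p=0.7, d=3.0))  # [4]
```

The exhaustive version needs a distance computation for every pair of objects, which is what motivates sampling on large datasets. Both versions still ask the user to choose p and d, the requirement the abstract says the new definition simplifies.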

References (9)

  • 1. E M Knorr, R T Ng, V Tucakov. Distance-Based Outliers: Algorithms and Applications[J]. VLDB Journal: Very Large Databases, 2000: 237~253
  • 2. S D Bay, M Schwabacher. Mining Distance-Based Outliers in Near Linear Time with Randomization and a Simple Pruning Rule[C]. In: SIGKDD '03, Washington, DC, USA, 2003
  • 3. J Laurikkala, M Juhola, E Kentala. Informal Identification of Outliers in Medical Data[C]. In: 5th International Workshop on Intelligent Data Analysis in Medicine and Pharmacology (IDAMAP-2000), 2000
  • 4. K Yamanishi, J Takeuchi. A Unifying Framework for Detecting Outliers and Change Points from Non-Stationary Time Series Data[C]. In: SIGKDD '02, Edmonton, Alberta, Canada, 2002
  • 5. S Ramaswamy, R Rastogi, K Shim. Efficient Algorithms for Mining Outliers from Large Data Sets[C]. In: Proceedings of the ACM SIGMOD Conference, 2000: 427~438
  • 6. Wen Jin, K H Tung, Jiawei Han. Mining Top-n Local Outliers in Large Databases[C]. In: KDD 2001, San Francisco, California, USA
  • 7. Jiawei Han, Micheline Kamber; translated by Fan Ming and Meng Xiaofeng. Data Mining: Concepts and Techniques (数据挖掘概念与技术)[M]. Beijing: China Machine Press, 2002
  • 8. F Angiulli, C Pizzuti. Fast Outlier Detection in High Dimensional Spaces[C]. In: Proceedings of the Sixth European Conference on the Principles of Data Mining and Knowledge Discovery, 2002: 15~16
  • 9. NHL data. http://moo.Hawaii.edu:1749/hockey/hockey.html

Co-citing documents: 14

Co-cited documents: 276

Citing documents: 44

Second-level citing documents: 250
