
New Approach Based on Square Neighborhood to Detect Outliers (cited by: 16)
Abstract: A new fast density-based approach to outlier detection, called outlier detecting based on square neighborhood (ODBSN), is presented. The algorithm replaces the ε-neighborhood of DBSCAN with a square neighborhood and, borrowing from grid-based methods, uses dense square neighborhoods to rule out non-outliers quickly. Because it expands neighborhoods rather than partitioning the space into grid cells, it avoids the "curse of dimensionality" that affects grid-based algorithms. A local deviation factor indicates how strongly each outlier deviates, so outliers are identified accurately and their degree of deviation is measurable. Theoretical analysis shows that the method is more efficient than the well-known density-based algorithms DBSCAN and LOF. Experimental results show that ODBSN effectively identifies outliers in data sets whose clusters have different shapes and varying densities, and that it is several times faster than the original DBSCAN and LOF algorithms.
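The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `half_side` and `min_pts` parameters, and the rule of skipping points that are already cleared are all assumptions made for illustration, and the local deviation factor is not shown because the abstract does not define it. The sketch treats the square neighborhood as an axis-aligned square (a hypercube in higher dimensions) centered on a point, and clears every point that falls inside a dense square as a non-outlier.

```python
from typing import List, Sequence, Set

Point = Sequence[float]


def square_neighbors(points: List[Point], center: int, half_side: float) -> List[int]:
    """Indices of points inside the axis-aligned square centered on points[center].

    A point is inside when every coordinate differs from the center by at most
    half_side (hypothetical parameter name), i.e. Chebyshev distance <= half_side.
    """
    c = points[center]
    return [
        j
        for j, p in enumerate(points)
        if all(abs(a - b) <= half_side for a, b in zip(p, c))
    ]


def candidate_outliers(points: List[Point], half_side: float, min_pts: int) -> List[int]:
    """Return indices that were NOT cleared by any dense square neighborhood.

    Assumption: a square is "dense" when it holds at least min_pts points
    (the center included), and every point in a dense square is a non-outlier.
    """
    cleared: Set[int] = set()
    for i in range(len(points)):
        if i in cleared:
            continue  # already ruled out, so its neighborhood query is skipped
        neighbors = square_neighbors(points, i, half_side)
        if len(neighbors) >= min_pts:
            cleared.update(neighbors)
    return [i for i in range(len(points)) if i not in cleared]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
    print(candidate_outliers(pts, half_side=0.5, min_pts=3))
```

In this toy data set the four clustered points share one dense square, so a single neighborhood query clears all of them and only the isolated point remains as a candidate. In a full method the remaining candidates would then be scored, for example by a deviation measure such as the one the abstract mentions.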
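Source: Control and Decision (《控制与决策》), 2006, No. 5, pp. 541-545, 554 (6 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (49971063); National "863" Program, marine monitoring theme sub-project (2001AA633010-04); Natural Science Foundation of Jiangsu Province (BK2001045)
Keywords: data mining; outliers; square neighborhood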

References (16)

  • 1. Breunig M M, Kriegel H P, Ng R T, et al. LOF: Identifying Density-based Local Outliers[A]. Proc of SIGMOD'00[C]. Dallas, 2000: 93-104.
  • 2. Ester M, Kriegel H P, Sander J, et al. A Density-based Algorithm for Discovering Clusters in Large Spatial Databases[A]. Proc of KDD'96[C]. Portland OR, 1996: 226-231.
  • 3. Barnett V, Lewis T. Outliers in Statistical Data[M]. New York: John Wiley, 1994.
  • 4. Hawkins D M. Identification of Outliers[M]. London: Chapman and Hall, 1980.
  • 5. Rousseeuw P J, Leroy A M. Robust Regression and Outlier Detection[M]. New York: John Wiley and Sons, 1987.
  • 6. Johnson T, Kwok I, Ng R T. Fast Computation of 2-dimensional Depth Contours[A]. Proc KDD[C]. New York: AAAI Press, 1998: 224-228.
  • 7. Knorr E, Ng R. A Unified Notion of Outliers: Properties and Computation[A]. Proc of the Int Conf on Knowledge Discovery and Data Mining[C]. New York: AAAI Press, 1997: 219-222.
  • 8. Knorr E, Ng R. Algorithms for Mining Distance-based Outliers in Large Datasets[A]. Proc 24th VLDB Conf[C]. New York: Morgan Kaufmann Publisher, 1998.
  • 9. Ramaswamy S, Rastogi R, Shim K. Efficient Algorithms for Mining Outliers from Large Data Sets[A]. Proc of SIGMOD'00[C]. Dallas, 2000: 427-438.
  • 10. Guha S, Rastogi R, Shim K. ROCK: A Robust Clustering Algorithm for Categorical Attributes[A]. Proc of ICDE'99[C]. Sydney, 1999: 512-521.

Co-cited documents: 112

Citing documents: 16

Second-level citing documents: 76
