Abstract
Existing work on outlier detection, for both general outliers and spatial outliers, emphasizes deviation in non-spatial attributes. In many applications, however, such as image processing and location-based services, spatial and non-spatial attributes must be considered together. We first give a definition of spatial outliers that takes both kinds of attributes into account, and then propose a new density-based detection method, the density-based spatial outlier detection algorithm with leaping sampling (DBSODLS). Existing density-based approaches must compute the neighborhood of every object, which is time-consuming. The proposed algorithm makes full use of neighbor information that has already been obtained and leaps to the next object to examine instead of visiting every object, which avoids many neighborhood queries and finds spatial outliers quickly. Theoretical analysis and experimental results both show that the algorithm is clearly more time-efficient than existing density-based methods.
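The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: reuse neighborhoods that are already known so that not every object needs its own neighborhood query. Everything specific here is an assumption for illustration and not the paper's DBSODLS definition: the point layout `(x, y, attr)`, the parameters `eps`, `delta` and `min_pts`, the combined spatial/non-spatial neighborhood test, and the DBSCAN-style rule that an object is an outlier when it is neither dense nor a neighbor of a dense object.

```python
from math import dist


def region_query(points, i, eps, delta):
    """Indices of objects close to object i in space AND in attribute value.

    Assumed neighborhood test: spatial distance <= eps and attribute
    difference <= delta. The object itself is included in the result.
    """
    xi, yi, ai = points[i]
    return [j for j, (x, y, a) in enumerate(points)
            if dist((xi, yi), (x, y)) <= eps and abs(ai - a) <= delta]


def find_outliers(points, eps, delta, min_pts):
    """Return (outlier indices, number of neighborhood queries performed)."""
    n = len(points)
    covered = [False] * n   # True once an object is known to lie in a dense region
    cache = {}              # object index -> neighbor list, filled lazily

    def query(i):
        if i not in cache:
            cache[i] = region_query(points, i, eps, delta)
        return cache[i]

    candidates = []
    for i in range(n):
        if covered[i]:
            continue        # leap: a dense neighbor already vouches for this object
        nbrs = query(i)
        if len(nbrs) >= min_pts:
            for j in nbrs:  # dense object: all of its neighbors are non-outliers
                covered[j] = True
        else:
            candidates.append(i)

    outliers = []
    for i in candidates:
        if covered[i]:
            continue        # covered later by a dense object
        # A skipped neighbor may itself be dense; candidates have few
        # neighbors, so checking them costs only a handful of extra queries.
        if any(len(query(j)) >= min_pts for j in cache[i] if j != i):
            continue
        outliers.append(i)
    return outliers, len(cache)
```

On clustered data, most objects are covered by a dense neighbor before they are visited, so the number of queries returned as the second value is typically well below the number of objects. The brute-force `region_query` is kept for brevity; a spatial index would normally replace it.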
Source
Journal of Image and Graphics (《中国图象图形学报》)
CSCD
PKU Core Journal (北大核心)
2006, Issue 9, pp. 1230-1236 (7 pages)
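Funding
National High Technology Research and Development Program of China ("863" Program) (2001AA633010-04)
National Natural Science Foundation of China (49971063)
Natural Science Foundation of Jiangsu Province (BK2001045)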
Keywords
data mining
spatial outliers
spatial database
impact neighborhood