Abstract
Pivot-based outlier detection algorithms for metric spaces aim to raise the outlier-degree threshold as quickly as possible so that outliers can be detected rapidly. Existing algorithms, however, lack an effective pivot selection method: the selected pivots are unstable, which causes large fluctuations in performance. The density peak algorithm, originally applied to clustering, is a promising pivot selection method, but its density peak search target is hard to determine. This paper improves the density peak algorithm by automatically determining a distance value and counting the objects within that distance to identify the density peak, thereby selecting the pivot with the largest density for use in metric space outlier detection. Experimental results show that the proposed algorithm clearly outperforms the existing algorithm: the average speedup is 2.41 and the maximum is 6.28; the number of distance computations is reduced by 60.67% on average and by up to 91.17%, while the index construction time remains acceptable.
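The pivot selection idea summarized in the abstract (fix a distance, count the objects within that distance of each candidate, keep the candidate with the largest count) can be illustrated with a minimal Python sketch. The abstract does not say how the distance value is determined automatically, so the sketch assumes, purely for illustration, that it is taken as a low quantile of all pairwise distances; the function name `select_density_pivot` and the `quantile` parameter are hypothetical and do not come from the paper.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def select_density_pivot(
    objects: Sequence[T],
    dist: Callable[[T, T], float],
    quantile: float = 0.2,
) -> Tuple[int, float, List[int]]:
    """Return (pivot index, cutoff distance, per-object densities).

    Assumption (not from the paper): the cutoff distance is the value at
    the given quantile of the sorted pairwise distances.
    """
    n = len(objects)
    if n < 2:
        raise ValueError("need at least two objects")

    # Compute every pairwise distance once; only the metric is used,
    # so the objects need not be vectors.
    d = [[0.0] * n for _ in range(n)]
    pairwise = []
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = dist(objects[i], objects[j])
        pairwise.append(d[i][j])

    # Illustrative automatic choice of the cutoff distance.
    pairwise.sort()
    k = min(len(pairwise) - 1, max(0, int(quantile * len(pairwise))))
    cutoff = pairwise[k]

    # Density of an object = number of other objects within the cutoff.
    density = [
        sum(1 for j in range(n) if j != i and d[i][j] <= cutoff)
        for i in range(n)
    ]

    # The pivot is the object with the largest density (first one on ties).
    pivot = max(range(n), key=density.__getitem__)
    return pivot, cutoff, density


if __name__ == "__main__":
    points = [0, 2, 3, 4, 50, 90]
    pivot, cutoff, density = select_density_pivot(
        points, lambda a, b: abs(a - b)
    )
    print(pivot, cutoff, density)
```

This sketch computes all pairwise distances, which is quadratic in the number of objects; it is meant only to make the density-counting step concrete, not to reproduce the index construction cost reported above.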
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
PKU Core Journal (北大核心)
2017, Issue 5, pp. 983-987 (5 pages)
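Funding
Supported by the National High Technology Research and Development Program of China ("863" Program) (2015AA015305)
Supported by the National Natural Science Foundation of China-Guangdong Joint Fund (U1301252, U1501254)
Supported by the Guangdong Provincial Key Laboratory Construction Project (2012A061400024)
Supported by the Natural Science Foundation of Guangdong Province (2015A030313636)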
Keywords
outlier detection
metric space
index
pivot selection
density peak