Abstract
Clustering-based outlier detection algorithms produce coarse and insufficiently accurate results. To address this problem, this paper proposes an outlier detection algorithm based on the Approximate Outlier Factor (AOF). The algorithm defines a similarity distance and an outlier similarity coefficient, and gives a pruning strategy based on the similarity distance that shrinks the candidate set of suspected outliers and lowers the computational complexity of detection. Experiments on the public datasets Iris, Labor and Segment-test show that the algorithm is more effective than classical outlier detection algorithms at finding outliers and at reducing the candidate set.
Source
《计算机工程》
CAS
CSCD
2013, No. 11, pp. 200-204 (5 pages)
Computer Engineering
Funding
National Key Technology R&D Program of China (2012BAH08B01)
Natural Science Foundation of Hunan Province (12JJ3074)
Keywords
clustering outlier
outlier detection
Approximate Outlier Factor(AOF)
pruning strategy
outlier candidate set