Abstract
Drawing on the idea of universal gravitation, this paper proposes a dissimilarity measure and a measure of how far a cluster deviates from the dataset as a whole. On this basis, it introduces a clustering-based outlier detection approach named EOD. The time complexity of the approach is nearly linear in both the dataset size and the number of attributes, so it scales well and suits large datasets. Theoretical analysis and experimental results on real datasets show that the approach is effective, robust, and practical.
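The abstract does not give the EOD formulas, so the following is only a minimal sketch of the general idea of clustering-based outlier detection, not the authors' method. It assumes numeric attributes, Euclidean distance, a single-scan clustering step with a hypothetical `radius` threshold, and a hypothetical size-weighted cluster factor in which large, distant clusters "pull" more strongly, loosely in the spirit of the gravity analogy. All function and parameter names are illustrative.

```python
import math


def one_pass_cluster(points, radius):
    """Single scan: join the nearest cluster if its centroid is within
    `radius`, otherwise start a new cluster (assumed clustering step)."""
    clusters = []  # each cluster: {"sum": [...], "n": int, "members": [indices]}
    for idx, p in enumerate(points):
        best, best_d = None, None
        for c in clusters:
            centroid = [s / c["n"] for s in c["sum"]]
            d = math.dist(p, centroid)
            if best_d is None or d < best_d:
                best, best_d = c, d
        if best is not None and best_d <= radius:
            best["sum"] = [s + x for s, x in zip(best["sum"], p)]
            best["n"] += 1
            best["members"].append(idx)
        else:
            clusters.append({"sum": list(p), "n": 1, "members": [idx]})
    return clusters


def outlier_factors(clusters):
    """Hypothetical factor: size-weighted mean distance from a cluster's
    centroid to every other cluster's centroid."""
    centroids = [[s / c["n"] for s in c["sum"]] for c in clusters]
    total = sum(c["n"] for c in clusters)
    factors = []
    for i, ci in enumerate(centroids):
        weighted = sum(
            clusters[j]["n"] * math.dist(ci, cj)
            for j, cj in enumerate(centroids)
            if j != i
        )
        factors.append(weighted / total)
    return factors


def detect_outliers(points, radius, top_fraction=0.1):
    """Flag members of the highest-factor clusters until roughly
    `top_fraction` of the points have been flagged (assumed cutoff rule)."""
    clusters = one_pass_cluster(points, radius)
    factors = outlier_factors(clusters)
    order = sorted(range(len(clusters)), key=lambda i: factors[i], reverse=True)
    budget = top_fraction * len(points)
    flagged = []
    for i in order:
        if len(flagged) + clusters[i]["n"] > budget:
            break
        flagged.extend(clusters[i]["members"])
    return sorted(flagged)
```

With n points, k clusters, and m attributes, the scan costs on the order of n·k·m distance operations, which stays close to linear in n and m as long as k remains small. That is one plausible reading of the near-linear complexity claimed in the abstract, not a statement about the paper's actual analysis.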
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core Journal (北大核心)
2007, No. 7, pp. 166-168 (3 pages)
Funding
National Natural Science Foundation of China (Grant Nos. 60503048, 60673191)
Key Project funded by Guangdong University of Foreign Studies (GW2005-1-012)
Keywords
Clustering
Outlier factor
Outlier detection