
Outlier Detecting Algorithm Based on Clustering and Local Information (cited by: 1)
Abstract: Most existing outlier detection algorithms ignore the local information in a data set, which lowers their detection accuracy. To address this problem, the paper proposes a new two-phase outlier detection algorithm based on clustering and local information. The algorithm builds on k-means clustering and defines a new local outlier factor as the criterion for judging whether a data object is an outlier, improving on the procedure of traditional outlier detection algorithms. Experimental results show that the algorithm retains linear time complexity while finding outliers in data sets more accurately and effectively.
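Source: Journal of Jilin University: Science Edition (《吉林大学学报(理学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2012, No. 6, pp. 1214-1217 (4 pages)
Funding: Key Project of the Jilin Province Science and Technology Development Plan (Grant No. 20090304)
Keywords: outlier detection; k-means clustering; local outlier factor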
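The abstract describes a two-phase scheme: cluster the data with k-means, then score each object with a local factor. This record does not reproduce the paper's definition of that factor, so the following is only a minimal sketch under stated assumptions. The score used here (an object's distance to its own centroid divided by the mean such distance within its cluster) is a hypothetical stand-in, and the names `kmeans`, `local_scores`, and `THRESHOLD` are illustrative, not taken from the paper.

```python
import math


def kmeans(points, init, iters=10):
    """Phase 1: Lloyd's k-means starting from caller-supplied centroids."""
    centroids = list(init)
    k = len(centroids)

    def nearest(p):
        # Index of the centroid closest to point p.
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iters):
        labels = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    # Final assignment so labels agree with the returned centroids.
    labels = [nearest(p) for p in points]
    return centroids, labels


def local_scores(points, centroids, labels):
    """Phase 2 (assumed scoring rule, not the paper's factor):
    distance to own centroid / mean distance within the same cluster."""
    dists = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    total = [0.0] * len(centroids)
    count = [0] * len(centroids)
    for d, lab in zip(dists, labels):  # one pass per cluster statistic
        total[lab] += d
        count[lab] += 1
    scores = []
    for d, lab in zip(dists, labels):
        mean_d = total[lab] / count[lab]
        scores.append(d / mean_d if mean_d > 0 else 0.0)
    return scores


# Toy data: two tight groups plus one stray point at (4, 4).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (4, 4)]
centroids, labels = kmeans(points, init=[(0, 0), (10, 10)])
scores = local_scores(points, centroids, labels)

THRESHOLD = 2.0  # assumed cut-off, chosen for this toy data only
outliers = [p for p, s in zip(points, scores) if s > THRESHOLD]
print(outliers)
```

Both phases touch each point a constant number of times per iteration, so for a fixed number of clusters and iterations the cost grows linearly with the number of objects, in line with the complexity claim in the abstract. Because the score is normalised per cluster, a point is judged against its own neighbourhood and not against the whole data set. One known weakness of this stand-in rule: a stray point that ends up as a singleton cluster scores 0 and is missed.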
