
A Data Stream Outlier Detection Algorithm Based on Grid (cited by: 1)
Abstract: The main goal of data stream outlier detection is to find the outliers in a data stream accurately and within a reasonable time. Traditional outlier detection algorithms find outliers effectively in static data sets, but they are not applicable to a dynamically changing data stream, where they cannot find abnormal data in a timely and effective way. To meet the requirements of real-time discovery and dynamic adjustment in the data stream setting, and to address the inapplicability of traditional algorithms, a new grid-based data stream outlier detection algorithm, ODGrid, is proposed. ODGrid finds abnormal data in the data stream in real time and dynamically adjusts its detection results as the data stream changes. Experiments on real and synthetic data sets show that ODGrid is superior to existing data stream outlier detection algorithms in accuracy and speed, and that it scales well with the dimensionality of the data space.
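Source: Journal of Natural Science of Heilongjiang University (《黑龙江大学自然科学学报》), indexed in CAS and the Peking University Core Journals list (北大核心), 2015, No. 2, pp. 276-280 (5 pages)
Funding: Science and Technology Research Project of the Heilongjiang Provincial Department of Education (12531542)
Keywords: data mining; data stream; outlier detection; grid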
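The abstract names two behaviours: flagging outliers as points arrive, and revising earlier results as the stream changes. The sketch below illustrates the general idea of grid-based detection over a sliding window. It is not the ODGrid algorithm, whose details are not given in this record. The class name `GridStreamDetector`, the parameters `cell_width`, `window_size` and `min_neighbors`, and the rule "a point is an outlier if its cell and the adjacent cells hold fewer than `min_neighbors` other points" are all assumptions made for illustration.

```python
from collections import Counter, deque
from itertools import product
import math


class GridStreamDetector:
    """Illustrative grid-based outlier detector over a sliding window.

    Hypothetical sketch only: this is NOT the ODGrid algorithm from the paper.
    """

    def __init__(self, cell_width, window_size, min_neighbors):
        self.cell_width = cell_width        # side length of each grid cell
        self.window_size = window_size      # number of most recent points kept
        self.min_neighbors = min_neighbors  # density threshold (assumed rule)
        self.counts = Counter()             # grid cell -> points in the window
        self.window = deque()               # (point, cell) pairs, oldest first

    def _cell(self, point):
        # Map a point to the integer coordinates of its grid cell.
        return tuple(math.floor(x / self.cell_width) for x in point)

    def _neighborhood_count(self, cell):
        # Count window points in the cell and in all adjacent cells.
        total = 0
        for offset in product((-1, 0, 1), repeat=len(cell)):
            neighbor = tuple(c + o for c, o in zip(cell, offset))
            total += self.counts.get(neighbor, 0)
        return total

    def _is_sparse(self, cell):
        # Subtract 1 so that a point is not counted as its own neighbor.
        return self._neighborhood_count(cell) - 1 < self.min_neighbors

    def update(self, point):
        """Add a point, evict the oldest if needed, report if it looks like an outlier."""
        cell = self._cell(point)
        self.window.append((point, cell))
        self.counts[cell] += 1
        if len(self.window) > self.window_size:
            _, old_cell = self.window.popleft()
            self.counts[old_cell] -= 1
            if self.counts[old_cell] == 0:
                del self.counts[old_cell]
        return self._is_sparse(cell)

    def current_outliers(self):
        """Re-evaluate every point in the window against the current grid."""
        return [p for p, cell in self.window if self._is_sparse(cell)]


if __name__ == "__main__":
    detector = GridStreamDetector(cell_width=1.0, window_size=100, min_neighbors=2)
    for p in [(0.1, 0.1), (0.2, 0.3), (0.5, 0.5), (0.4, 0.2), (10.0, 10.0)]:
        print(p, detector.update(p))
    print(detector.current_outliers())
```

In this sketch `update` does a constant amount of work per point for a fixed dimension, but it inspects 3^d neighboring cells, so the cost grows quickly with dimension d. `current_outliers` shows the dynamic adjustment idea: a point flagged when the grid was still empty is no longer reported once enough nearby points have arrived.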


