Two-Stage Outlier Detection Approach
Cited by: 7
Abstract: This paper introduces a new distance definition and a new outlier factor for objects. On this basis, it presents a two-stage outlier detection approach named TOD. The first stage clusters the data with a new clustering algorithm, and the second stage identifies outliers using the outlier factor of each object. The time complexity of TOD is linear in the size of the dataset and nearly linear in the number of attributes, so the algorithm scales well and is suitable for large datasets. Theoretical analysis and experimental results show that TOD is robust and practical.
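The abstract does not give the paper's distance definition, clustering algorithm, or outlier factor, so the two-stage idea can only be illustrated under assumptions. The minimal Python sketch below is not the TOD algorithm itself. It assumes Euclidean distance, a single-scan leader-style clustering with a hypothetical `radius` threshold for stage one, and a hypothetical outlier score (the cluster-size-weighted mean distance from an object to all cluster centroids) for stage two. All function and parameter names are illustrative.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Plain Euclidean distance (an assumption; the paper defines its own distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_pass_cluster(data: List[Point], radius: float) -> Tuple[List[List[float]], List[int]]:
    """Stage 1 (assumed): scan the data once; a point joins the nearest cluster
    if that centroid lies within `radius`, otherwise it starts a new cluster."""
    sums: List[List[float]] = []  # per-cluster coordinate sums
    sizes: List[int] = []         # per-cluster object counts
    for p in data:
        best, best_d = -1, float("inf")
        for i, (s, n) in enumerate(zip(sums, sizes)):
            d = euclidean(p, [v / n for v in s])
            if d < best_d:
                best, best_d = i, d
        if best >= 0 and best_d <= radius:
            sums[best] = [v + x for v, x in zip(sums[best], p)]
            sizes[best] += 1
        else:
            sums.append(list(p))
            sizes.append(1)
    centroids = [[v / n for v in s] for s, n in zip(sums, sizes)]
    return centroids, sizes


def outlier_factors(data: List[Point], centroids: List[List[float]], sizes: List[int]) -> List[float]:
    """Stage 2 (assumed score): size-weighted mean distance to all centroids."""
    total = sum(sizes)
    return [
        sum(n / total * euclidean(p, c) for c, n in zip(centroids, sizes))
        for p in data
    ]


def detect_outliers(data: List[Point], radius: float, top_k: int) -> List[int]:
    """Return indices of the `top_k` objects with the largest assumed score."""
    centroids, sizes = one_pass_cluster(data, radius)
    scores = outlier_factors(data, centroids, sizes)
    ranked = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]
```

With a bounded number of clusters, both stages in this sketch touch each object a constant number of times, which is one way a cost linear in dataset size and roughly linear in the number of attributes could arise. The paper's own analysis may differ.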
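Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2005, Issue 7, pp. 1237-1240 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (60273075)
Keywords: clustering; outlier factor; outlier detection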

References (9)

  • 1. Knorr E M, Ng R T. Algorithms for mining distance-based outliers in large datasets[C]. In: Proc. 24th Int. Conf. on Very Large Data Bases, New York, NY, 1998: 392-403.
  • 2. Sheng-Yi Jiang, Qing-Hua Li, Ken-Li Li, Hui Wang, Zhong-Lou Meng. GLOF: a new approach for mining local outlier[C]. Int. Conf. Mach. Learn. Cybern., 2003, 11: 157-162.
  • 3. He Zeng-you, Xu Xiao-fei, Deng Sheng-chun. Discovering cluster-based local outliers[J]. Pattern Recognition Letters, 2003, 24(9-10): 1651-1660.
  • 4. Leonid Portnoy, Eleazar Eskin, Salvatore J. Stolfo. Intrusion detection with unlabeled data using clustering[C]. In: Proc. of ACM CSS Workshop on Data Mining Applied to Security (DMSA-2001), Philadelphia, PA, 2001.
  • 5. Hawkins S, He H, Williams G J, Baxter R A. Outlier detection using replicator neural networks[C]. In: Proc. of the 4th Int. Conf. on Data Warehousing and Knowledge Discovery, Aix-en-Provence, France, 2002: 170-180.
  • 6. He Zeng-you, Xu Xiao-fei, Deng Sheng-chun. Squeezer: An Efficient Algorithm for Clustering Categorical Data[J]. Journal of Computer Science & Technology, 2002, 17(5): 611-624. (Cited by: 31)
  • 7. Guha S, Rastogi R, Shim K. ROCK: A robust clustering algorithm for categorical attributes[C]. In: Proc. of the 15th ICDE, Sydney, Australia, 1999: 512-521.
  • 8. Merz C J, Murphy P. UCI repository of machine learning databases[EB/OL]. URL: http://www.ics.uci.edu/~mlearn/MLRepository.html, 1996.
  • 9. Eskin E, Arnold A, Prerau M, Portnoy L, Stolfo S. A geometric framework for unsupervised anomaly detection: detecting intrusions in unlabeled data[C]. In: Barbara D, Jajodia S (editors), Applications of Data Mining in Computer Security, Kluwer, 2002.
