An Outlier Detection Algorithm Based on Similarity Measurement (一种基于相似度量的离群点检测方法) Cited by: 2
Abstract: Outlier detection is an important area of data mining, widely used in applications such as credit card fraud detection and network intrusion detection. Combining hierarchical clustering with similarity, this paper defines a similarity measurement function for high-dimensional data and the concept of class density, and redefines outliers in high-dimensional data in terms of class density. On this basis it proposes an outlier detection algorithm based on similarity measurement. Experiments indicate that the algorithm is of some value for detecting outliers in high-dimensional data.
Source: Journal of Chongqing Technology and Business University: Natural Science Edition (《重庆工商大学学报(自然科学版)》), 2012, No. 10, pp. 96-100 (5 pages)
Funding: Natural Science Fund Project of the Anhui Provincial Department of Education (05010428)
Keywords: outlier; network intrusion; data mining; hierarchical clustering; similarity measurement
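
The abstract names three ingredients (a similarity function for high-dimensional data, a class density, and outliers defined through that density) but does not give their formulas. The following is only a minimal sketch of how such a pipeline could fit together, not the paper's algorithm. Everything specific in it is an assumption made for illustration: the similarity form (the mean over dimensions of 1 / (1 + |x_i - y_i|)), the use of single-linkage clustering cut at a similarity threshold, the definition of class density as cluster size divided by the number of points, and the parameter names `threshold` and `min_density`.

```python
from typing import Dict, List, Sequence


def similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Assumed similarity: mean over dimensions of 1 / (1 + |x_i - y_i|), in (0, 1]."""
    return sum(1.0 / (1.0 + abs(a - b)) for a, b in zip(x, y)) / len(x)


def single_linkage_clusters(points: List[Sequence[float]], threshold: float) -> List[List[int]]:
    """Single-linkage hierarchical clustering cut at `threshold`.

    Two points share a cluster if a chain of pairs with similarity >= threshold
    connects them. Implemented with a small union-find over all pairs (O(n^2)).
    """
    n = len(points)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity(points[i], points[j]) >= threshold:
                parent[find(i)] = find(j)

    clusters: Dict[int, List[int]] = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def detect_outliers(points: List[Sequence[float]],
                    threshold: float = 0.5,
                    min_density: float = 0.2) -> List[int]:
    """Return indices of points in clusters whose assumed density is below min_density.

    Class density is assumed here to be cluster size / total number of points.
    """
    n = len(points)
    outliers: List[int] = []
    for members in single_linkage_clusters(points, threshold):
        if len(members) / n < min_density:
            outliers.extend(members)
    return sorted(outliers)


# Five nearby 4-dimensional points and one distant point.
data = [
    [1.0, 1.0, 1.0, 1.0],
    [1.1, 0.9, 1.0, 1.2],
    [0.9, 1.0, 1.1, 1.0],
    [1.0, 1.1, 0.9, 0.9],
    [1.2, 1.0, 1.0, 1.1],
    [9.0, 8.0, 9.5, 7.0],
]
print(detect_outliers(data))  # prints [5]
```

With these illustrative settings the first five points form one dense cluster and the last point forms a singleton cluster of density 1/6, which falls below `min_density`. Both thresholds would need tuning on real data, and the definitions in the paper itself may differ.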
References (10)

  • 1 HAWKINS D. Identification of Outliers [M]. London: Chapman and Hall, 1980.
  • 2 KNORR E, NG R. Algorithms for mining distance-based outliers in large datasets [A]. Proc of the 24th VLDB Conf [C]. New York: Morgan Kaufmann, 1998: 392-403.
  • 3 HAN J W, KAMBER M. Data Mining: Concepts and Techniques [M]. San Francisco: Morgan Kaufmann, 2001.
  • 4 ROUSSEEUW P J, LEROY A M. Robust Regression and Outlier Detection [M]. New York: John Wiley & Sons, 1987.
  • 5 AGRAWAL R, GEHRKE J, GUNOPULOS D, et al. Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications [C]//Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, Seattle, Washington, 1998.
  • 6 AGGARWAL C C, PROCOPIUC C, WOLF J L, et al. Fast algorithms for projected clustering [C]//Proc. of the ACM SIGMOD Conference, Philadelphia, PA, 1999: 61-72.
  • 7 XU Z S, XIA M M. Distance and similarity measures for hesitant fuzzy sets [J]. Information Sciences, 2011: 2128-2138.
  • 8 AGRAWAL R, GEHRKE J, GUNOPULOS D, et al. Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications [C]//ACM SIGMOD Conference, 1998.
  • 9 HE Ling, WU Lingda, CAI Yichao. Similarity measurement of data in high-dimensional space [J]. Mathematics in Practice and Theory (数学的实践与认识), 2006, 36(9): 189-194. Cited by: 20
  • 10 HUANG Sida, CHEN Qimai. Research on a high-dimensional data clustering algorithm based on similarity measurement [J]. Computer Applications and Software (计算机应用与软件), 2009, 26(9): 102-105. Cited by: 13
