
Novel two-stage approach based on KNN graph for outlier detection and its application research (original title: 基于KNN图的两阶段孤立点检测及应用研究)

Cited by: 1
Abstract: To address the shortcomings of two KNN-graph-based outlier detection methods, Outlier Detection using Indegree Number (ODIN) and the K-nearest neighbor (RSS) algorithm, this paper proposes an improved two-stage KNN-graph-based outlier detection method. The method is further extended so that it can be used when the number of outliers in the dataset is unknown. Applied to "small sample, high dimension" gene microarray datasets for sample-level outlier detection, the algorithm achieves good results, which supports the effectiveness of the method.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal (北大核心), 2008, No. 2, pp. 186-189 (4 pages)
Keywords: outlier detection; KNN graph; microarray datasets
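
The abstract names ODIN as one of the two baselines the paper improves on, but it does not spell out the two-stage procedure itself. As a rough illustration of the baseline idea only, the sketch below builds a directed k-nearest-neighbour graph and flags points whose indegree is at or below a threshold. The function names, the `threshold` parameter, and the toy data are assumptions made for this example; this is not the paper's two-stage method.

```python
import math


def knn_indegrees(points, k):
    """Return the indegree of every point in the directed kNN graph."""
    n = len(points)
    indegree = [0] * n
    for i in range(n):
        # Rank all other points by Euclidean distance to point i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        # Point i has one outgoing edge to each of its k nearest neighbours.
        for j in others[:k]:
            indegree[j] += 1
    return indegree


def odin_outliers(points, k, threshold):
    """Flag points whose kNN-graph indegree is at most `threshold`.

    `threshold` is a hypothetical parameter name chosen for this sketch.
    """
    indegree = knn_indegrees(points, k)
    return [i for i, d in enumerate(indegree) if d <= threshold]


# Toy data (made up): four points on a unit square plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
outliers = odin_outliers(points, k=2, threshold=0)
print(outliers)  # [4]
```

No square point lists the distant point among its two nearest neighbours, so its indegree is 0 and it is the only index flagged. The brute-force neighbour search is quadratic in the number of samples, which is usually acceptable for the small-sample setting the abstract describes.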

References (12)

  • 1 Yamanishi K, Takeuchi J. A unifying framework for detecting outliers and change points from non-stationary time series data[C]//SIGKDD'02, Edmonton, Alberta, Canada, 2002.
  • 2 Laurikkala J, Juhola M, Kentala E. Informal identification of outliers in medical data[C]//5th International Workshop on Intelligent Data Analysis in Medicine and Pharmacology, IDAMAP, 2000.
  • 3 Hautamaki V, Karkkainen I, Franti P. Outlier detection using k-nearest neighbour graph[C]//17th International Conference on Pattern Recognition (ICPR'04). Oakland: IEEE Computer Press, 2004: 430-433.
  • 4 Ramaswamy S, Rastogi R, Shim K. Efficient algorithms for mining outliers from large data sets[C]//Proceedings of the 2000 ACM SIGMOD Int Conf on Management of Data, Dallas, Texas, May 2000: 427-438.
  • 5 Han Jia-wei, Kamber M. Data Mining: Concepts and Techniques (数据挖掘概念与技术)[M]. Fan Ming, Meng Xiaofeng, trans. Beijing: China Machine Press (机械工业出版社), 2002: 223-259.
  • 6 Franti P, Virmajoki O, Hautamaki V. Graph-based agglomerative clustering[C]//Proceedings of The Third IEEE Int Conf on Data Mining, Melbourne, Florida, November 2003: 525-528.
  • 7 Alon U, Barkai N, Notterman D A, et al. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays[J]. Proc Natl Acad Science, 2001, 4: 727-739.
  • 8 Kadota K, Tominaga D, Akiyama Y, et al. Detecting outlying samples in microarray data: a critical assessment of the effect of outliers on sample classification[J]. Chem-Bio Informatics Journal, 2003, 3(1): 30-45.
  • 9 Golub T R, Slonim D K, Tamayo P, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring[J]. Science, 1999, 286: 531-537.
  • 10 Marques J P. Pattern recognition: concepts, methods and applications[M]. Berlin Heidelberg: Springer-Verlag, 2002.


