
Algorithm for outlier detection based on principal component analysis and the sum of attribute distances

Cited by: 3
Abstract: An outlier detection algorithm based on principal component analysis (PCA) and the sum of attribute distances is proposed. The algorithm first uses PCA to extract, from the many original attributes, the principal components that satisfy the required cumulative contribution rate. It then uses the PCA transformation matrix to map the original dataset into a new feature space spanned by those principal components. Finally, outliers are detected in the transformed dataset using the sum-of-attribute-distances method. Experimental results show that the algorithm is effective.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2009, No. 17, pp. 139-141, 243 (4 pages)
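Funding: National Natural Science Foundation of China (No. 60773100); Key Science and Technology Research Project of the Ministry of Education (No. 205014); Research Program of the Hebei Provincial Department of Education (No. 2006143)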
Keywords: outlier; principal component analysis; cumulative contribution rate; sum of attribute distances
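
The pipeline described in the abstract can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not define the attribute-distance sum, so the sketch assumes it is, for each record, the sum over all other records of the absolute per-attribute differences in the principal-component space. The function names, the 0.85 default threshold, and the top-n selection rule are likewise hypothetical.

```python
import numpy as np


def pca_transform(X, min_cumulative_rate=0.85):
    """Project X onto the leading principal components whose cumulative
    contribution rate (share of total variance) first reaches the threshold.
    Assumes X has at least two attributes (columns)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each attribute
    cov = np.cov(Xc, rowvar=False)               # attribute covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    rates = np.cumsum(eigvals) / eigvals.sum()   # cumulative contribution rate
    k = int(np.searchsorted(rates, min_cumulative_rate)) + 1
    k = min(k, len(eigvals))                     # guard against rounding at 1.0
    return Xc @ eigvecs[:, :k]                   # data in the new feature space


def attribute_distance_sums(Z):
    """ASSUMED score: for record i, sum over all records j and all
    attributes a of |Z[i, a] - Z[j, a]|. The paper may define it differently."""
    Z = np.asarray(Z, dtype=float)
    diffs = np.abs(Z[:, None, :] - Z[None, :, :])  # shape (n, n, k)
    return diffs.sum(axis=(1, 2))


def detect_outliers(X, n_outliers, min_cumulative_rate=0.85):
    """Return indices of the n_outliers records with the largest scores,
    together with the scores of all records."""
    Z = pca_transform(X, min_cumulative_rate)
    scores = attribute_distance_sums(Z)
    top = np.argsort(scores)[::-1][:n_outliers]
    return top, scores
```

Calling `detect_outliers(X, n_outliers=5)` on an `(n_records, n_attributes)` array returns the five highest-scoring row indices. The pairwise difference array costs O(n²k) memory, so a large dataset would need a chunked or sort-based computation instead.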



