
Sequence-based Mixed Attribute Outlier Detection in Neighborhood Rough Sets

Cited by: 6
Abstract: Outlier detection has a wide range of applications. Outlier detection methods based on classical rough sets cannot effectively handle numerical attribute data, so a sequence-based method for mixed attribute outlier detection in neighborhood rough sets is proposed. The method builds an attribute sequence from the uniformity (variance) of each attribute's values, uses it to define a sequence of attribute sets, and constructs the corresponding neighborhood class sequence. Outliers are then detected by analyzing how objects change across the neighborhood class sequence, and a corresponding algorithm, Sequence-based Mixed Attribute Outlier Detection (SMAOD), is designed. When computing the neighborhood covering of a single attribute, the algorithm improves on the traditional one-by-one comparison pattern. Finally, the method is compared experimentally with the main outlier detection methods on UCI standard data sets, and the results show that it is effective.
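The abstract does not say how the single-attribute neighborhood covering is computed faster than by one-by-one comparison. As a minimal sketch of one possible approach, assuming a numerical attribute and a neighborhood of the form {y : |a(x) - a(y)| <= delta}, the values can be sorted once and each neighborhood located by binary search. The function names and the `delta` radius parameter are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left, bisect_right


def neighborhood_sizes_pairwise(values, delta):
    """Baseline: compare every object with every other object, O(n^2)."""
    return [sum(1 for w in values if abs(v - w) <= delta) for v in values]


def neighborhood_sizes_sorted(values, delta):
    """Assumed speed-up (not the paper's stated method): sort once, then
    find each neighborhood's bounds by binary search, O(n log n)."""
    ordered = sorted(values)
    return [
        bisect_right(ordered, v + delta) - bisect_left(ordered, v - delta)
        for v in values
    ]


# Integer values are used so both versions agree exactly at the boundaries;
# with floats, rounding in v + delta can shift a boundary case.
values = [1, 2, 3, 10]
sizes = neighborhood_sizes_sorted(values, delta=1)
print(sizes)  # [2, 3, 2, 1]
```

In this toy data the object with value 10 has only itself in its neighborhood, which is the kind of isolation an outlier score would pick up on. How SMAOD turns changes in the neighborhood class sequence into a score is not described in the abstract and is not sketched here.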
Authors: YUAN Zhong; ZHANG Xian-yong; FENG Shan (College of Mathematics and Software Science, Sichuan Normal University, Chengdu 610068, China; Institute of Intelligent Information and Quantum Information, Sichuan Normal University, Chengdu 610068, China)
Source: Journal of Chinese Computer Systems (小型微型计算机系统), indexed in CSCD and the Peking University Core Journals list, 2018, No. 6, pp. 1317-1322 (6 pages)
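Funding: Supported by the National Natural Science Foundation of China (61673285, 61203285), the Sichuan Youth Science and Technology Foundation (2017JQ0046), and the Scientific Research Fund of the Sichuan Provincial Education Department (15ZB0029)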
Keywords: outlier detection; neighborhood rough sets; sequence; uniformity (variance); mixed attribute; data mining
