Abstract
Outlier detection has a wide range of applications. Outlier detection methods based on classical rough sets cannot effectively handle numerical attribute data, so a sequence-based method for detecting outliers in mixed-attribute data is proposed within neighborhood rough sets. The method uses the uniformity (variance) of each attribute's values to construct an attribute sequence, from which an attribute set sequence is defined and a neighborhood class sequence is built. Outliers are then detected by analyzing how objects change across the neighborhood class sequence, and a corresponding algorithm, Sequence-based Mixed Attribute Outlier Detection (SMAOD), is designed. In computing the single-attribute neighborhood covering, the algorithm improves on the traditional one-by-one comparison scheme. Finally, the method is compared experimentally with the main outlier detection methods on UCI standard data sets, and the results show that the proposed method is effective.
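The abstract does not say how the algorithm avoids one-by-one comparison when computing the neighborhood covering of a single attribute. As an illustration only, the sketch below assumes one plausible approach: sort the values of a numerical attribute once, then locate each object's neighborhood by binary search. The function name `single_attribute_neighborhoods`, the radius parameter `delta`, and the neighborhood definition |a(x) - a(y)| <= delta are assumptions made for this example and are not taken from the paper.

```python
from bisect import bisect_left, bisect_right


def single_attribute_neighborhoods(values, delta):
    """For each object i, return the sorted indices j with
    values[i] - delta <= values[j] <= values[i] + delta.

    Hypothetical helper: sorting once and using binary search costs
    O(n log n) plus the output size, instead of comparing all pairs.
    """
    # Object indices ordered by attribute value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    sorted_vals = [values[i] for i in order]

    neighborhoods = []
    for v in values:
        # Contiguous slice of the sorted order that lies within the radius.
        lo = bisect_left(sorted_vals, v - delta)
        hi = bisect_right(sorted_vals, v + delta)
        neighborhoods.append(sorted(order[lo:hi]))
    return neighborhoods


if __name__ == "__main__":
    # Object 2 (value 5.0) is isolated: its neighborhood holds only itself.
    print(single_attribute_neighborhoods([1.0, 1.5, 5.0, 1.2], 0.5))
```

Because neighbors of a value always form a contiguous run in sorted order, each neighborhood is a slice; an object whose slice stays small across attributes is the kind of candidate an outlier detector would inspect further.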
Authors
YUAN Zhong (袁钟)
ZHANG Xian-yong (张贤勇)
FENG Shan (冯山)
Affiliations: College of Mathematics and Software Science, Sichuan Normal University, Chengdu 610068, China; Institute of Intelligent Information and Quantum Information, Sichuan Normal University, Chengdu 610068, China
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals (北大核心)
2018, No. 6, pp. 1317-1322 (6 pages)
Journal of Chinese Computer Systems
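Funding
Supported by the National Natural Science Foundation of China (61673285, 61203285)
Supported by the Sichuan Youth Science and Technology Foundation (2017JQ0046)
Supported by the Scientific Research Fund of the Sichuan Provincial Education Department (15ZB0029)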
Keywords
outlier detection
neighborhood rough sets
sequence
variance (uniformity)
mixed attribute
data mining