Abstract
Outlier detection methods based on classical rough sets have difficulty handling numerical attribute data. To address this problem, an outlier detection method based on neighborhood rough membership functions is proposed; it applies to numerical, symbolic, and hybrid attribute data. Based on a mixed distance and an adaptive radius, a neighborhood rough membership function is defined to describe the outlier degree of an object, a neighborhood rough outlier factor is constructed to perform outlier detection, and the corresponding outlier detection algorithm, NRMFOD, is designed. Comparative experiments on UCI data sets show that NRMFOD is effective and outperforms three commonly used detection algorithms (RMF, RBD, and DIS).
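The abstract names the building blocks (a mixed distance, a neighborhood, a neighborhood rough membership function, an outlier score) without giving formulas. The Python sketch below illustrates the general idea only and is not the NRMFOD algorithm: the per-attribute averaging in `mixed_distance`, the fixed `radius` argument (standing in for the paper's adaptive radius), and the density-style `outlier_score` are all assumptions made for illustration, as are every function and parameter name. Numeric attributes are assumed to be pre-scaled to [0, 1].

```python
def mixed_distance(x, y, numeric):
    """Assumed mixed distance: mean of per-attribute gaps.

    Numeric attributes (indices in `numeric`) contribute |a - b|;
    symbolic attributes contribute 0 on a match and 1 on a mismatch.
    """
    gaps = []
    for j, (a, b) in enumerate(zip(x, y)):
        if j in numeric:
            gaps.append(abs(a - b))
        else:
            gaps.append(0.0 if a == b else 1.0)
    return sum(gaps) / len(gaps)


def neighborhood(data, i, radius, numeric):
    """Indices of all objects within `radius` of object i (including i)."""
    return [j for j, y in enumerate(data)
            if mixed_distance(data[i], y, numeric) <= radius]


def membership(data, i, target, radius, numeric):
    """Neighborhood rough membership of object i in the index set `target`:
    the fraction of i's neighborhood that falls inside `target`."""
    nbrs = neighborhood(data, i, radius, numeric)
    return len(set(nbrs) & set(target)) / len(nbrs)


def outlier_score(data, i, radius, numeric):
    """Hypothetical score (not the paper's factor): objects with
    sparse neighborhoods score closer to 1."""
    return 1.0 - len(neighborhood(data, i, radius, numeric)) / len(data)


# Toy hybrid data: one numeric attribute and one symbolic attribute.
data = [(0.10, "a"), (0.12, "a"), (0.11, "a"), (0.95, "b")]
numeric = {0}
scores = [outlier_score(data, i, 0.1, numeric) for i in range(len(data))]
```

On this toy data the isolated fourth object receives the highest score, because its neighborhood contains only itself. The paper's actual factor and radius rule should be taken from the full text rather than from this sketch.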
Authors
杨晓玲
张贤勇
YANG Xiao-ling; ZHANG Xian-yong (College of Mathematics and Software Science, Sichuan Normal University, Chengdu 610066, China; Institute of Intelligent Information and Quantum Information, Sichuan Normal University, Chengdu 610066, China)
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal
2019, Issue 2, pp. 533-539 (7 pages)
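Funding
National Natural Science Foundation of China (61673285, 61203285)
Sichuan Youth Science and Technology Foundation (2017JQ0046)
Scientific Research Fund of the Sichuan Provincial Education Department (15ZB0028)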
Keywords
outlier detection
neighborhood rough set
rough membership function
hybrid attribute data
data mining