
Outlier detection algorithm based on neighborhood system density difference measurement (cited by: 11)
Abstract: The LOF outlier detection algorithm suffers from low detection accuracy and high parameter sensitivity on high-dimensional, discretely distributed datasets. To address these problems, this paper proposes NSD (neighborhood system density difference), an outlier detection algorithm based on measuring density differences between neighborhood systems. Unlike traditional density-based outlier detection methods, NSD introduces the concept of an intercept distance. First, it counts the neighbors of each object in the dataset that lie within the intercept distance. Second, it computes the neighborhood system density of each object. Third, it compares the density of the object with the densities of its neighbors to determine the degree to which the object and its neighbors tend to belong to the same cluster. Finally, it outputs the objects most likely to be outliers. Comparative experiments against the LOF, LDOF, and CBOF algorithms on real-world and synthetic datasets show that NSD achieves higher detection accuracy and execution efficiency with lower parameter sensitivity, demonstrating that the algorithm is effective and feasible.
Authors: Du Xusheng; Yu Jiong; Chen Jiaying; Wang Yuefei; Pu Yonglin; Ye Lele (School of Software, Xinjiang University, Urumqi 830008, China; School of Information Science & Engineering, Xinjiang University, Urumqi 830008, China; School of Software, Xi'an Jiaotong University, Xi'an 710049, China)
Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 7, pp. 1969-1973 (5 pages)
Funding: National Natural Science Foundation of China (61862060, 61462079, 61562086, 61562078).
Keywords: data mining; outlier detection; density-based; LOF (local outlier factor); LDOF (local distance-based outlier factor); CBOF (cohesiveness-based outlier factor)
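The four steps named in the abstract (count neighbors within the intercept distance, compute a density, compare it with the neighbors' densities, output the most likely outliers) can be illustrated with a minimal Python sketch. This record does not give the paper's formulas, so the density (a plain neighbor count) and the score (mean neighbor density divided by the object's own density) are assumptions made for illustration only; the function name `nsd_sketch` and the parameters `cutoff` and `top_n` are hypothetical and do not come from the paper.

```python
import math


def nsd_sketch(points, cutoff, top_n):
    """Rank points by a simple neighborhood-density-difference score.

    Illustrative only: the density and score definitions below are
    assumptions, not the formulas from the paper.
    """
    n = len(points)
    # Step 1: find each point's neighbors within the cutoff ("intercept") distance.
    neighbors = [
        [j for j in range(n)
         if j != i and math.dist(points[i], points[j]) <= cutoff]
        for i in range(n)
    ]
    # Step 2 (assumed): density is the number of neighbors within the cutoff.
    density = [len(nb) for nb in neighbors]
    # Step 3 (assumed): compare a point's density with its neighbors' mean density.
    scores = []
    for i in range(n):
        if not neighbors[i]:
            # A point with no neighbors is treated as maximally isolated.
            scores.append(math.inf)
            continue
        mean_nb = sum(density[j] for j in neighbors[i]) / len(neighbors[i])
        scores.append(mean_nb / density[i])
    # Step 4: return the indices of the points with the largest scores.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return ranked[:top_n]


# Four tightly grouped points and one far-away point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
top = nsd_sketch(points, cutoff=0.5, top_n=1)
print(top)  # prints [4]
```

In this toy input the far-away point has no neighbors within the cutoff and is ranked first. The brute-force pairwise distance computation is quadratic in the number of points, which is acceptable for a sketch but says nothing about the execution efficiency reported for the actual algorithm.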
