Abstract
DBSCAN-based outlier detection suffers from two problems with the parameter Eps: its value is uncertain, and a single global value is applied to the whole data set. To address this, the paper first proposes an adaptive DBSCAN outlier detection algorithm based on multi-objective optimization. Guided by the characteristics of each data set, it uses the NSGA-II optimization algorithm to adaptively solve for an optimal Eps for every data point, which avoids the shortcomings of setting the parameter by experience and resolves the clustering inaccuracy caused by a global parameter. Second, outliers are detected with an Eps-based LOF algorithm, which reduces the amount of computation. Finally, comparative experiments on different data sets show that the proposed algorithm detects outliers with higher accuracy.
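The idea of scoring outliers from per-point Eps values can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not give the NSGA-II objectives or the exact Eps-based LOF formula, so the code below makes two loudly stated assumptions. First, each point's Eps is taken to be its k-th nearest-neighbour distance, as a stand-in for the NSGA-II optimization. Second, the score is a simplified LOF-style ratio of a point's Eps to the mean Eps of the neighbours inside its Eps radius. The function name `eps_outlier_scores` and the parameter `k` are hypothetical, and the points are assumed to be distinct.

```python
import numpy as np

def eps_outlier_scores(X, k=3):
    """Simplified LOF-style score built on per-point Eps values.

    ASSUMPTION: Eps_i is the k-th nearest-neighbour distance of point i.
    The paper obtains Eps_i with NSGA-II instead; that step is not shown.
    """
    X = np.asarray(X, dtype=float)
    # Full pairwise Euclidean distance matrix (fine for small data sets).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Column 0 of each sorted row is the point itself (distance 0),
    # so column k is the distance to the k-th nearest neighbour.
    eps = np.sort(dist, axis=1)[:, k]
    scores = np.empty(len(X))
    for i in range(len(X)):
        # Neighbours inside the point's own Eps radius, excluding itself.
        nbrs = np.flatnonzero(dist[i] <= eps[i])
        nbrs = nbrs[nbrs != i]
        # A score well above 1 means the point needs a much larger radius
        # than its neighbours do, i.e. it lies in a sparser region.
        scores[i] = eps[i] / eps[nbrs].mean()
    return scores

# Usage: a 3x3 grid of points plus one isolated point at (10, 10).
grid = [(x, y) for x in range(3) for y in range(3)]
X = np.array(grid + [(10, 10)], dtype=float)
scores = eps_outlier_scores(X, k=3)
print(int(np.argmax(scores)))  # index of the highest-scoring point
```

Because each point only compares itself with the neighbours inside its own Eps radius, no separate k-neighbourhood search is needed at scoring time. This is one plausible reading of how an Eps-based LOF could reduce computation.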
Authors
HUANG Jian-rou (黄剑柔)
WANG Qian (王茜)
CAI Xing-juan (蔡星娟)
LI Jian-wei (李建伟)
College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, China
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2022, No. 4, pp. 702-706 (5 pages)
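Funding
Supported by the Young Scientists Fund of the National Natural Science Foundation of China (61806138)
Supported by the Key Research and Development Program of Shanxi Province (201903D421048).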