Abstract
To address the CURE algorithm's sensitivity to noise points, the limitations of its random sampling, its sensitivity to the shrink factor, and its poor efficiency when clustering large data sets, this paper proposes DCNDA (density-based CURE noise detection clustering algorithm), an improved CURE algorithm built on a DBSCAN algorithm that uses the MeanShift kernel function translation model. DBSCAN with adaptive parameters improves the accuracy and reliability of the initial clustering. A centroid formula is then introduced into CURE so that the result no longer depends on the shrink factor, which reduces time complexity and improves the algorithm's global convergence and reliability. Simulation results show that DCNDA outperforms the improved partition-based CURE algorithm and the PDBSCAN algorithm in time complexity, clustering accuracy, and outlier detection efficiency.
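The two-stage idea in the abstract (density-based pre-clustering that discards noise, followed by centroid-based merging instead of shrunken representative points) can be illustrated with a minimal sketch. The abstract gives neither the adaptive parameter rule nor the centroid formula, so everything specific here is an assumption: `eps` and `min_pts` are fixed by hand rather than derived from a MeanShift kernel, the centroid is a plain arithmetic mean, and the function names `dbscan`, `mean`, and `merge_by_centroid` are hypothetical. It is not the authors' DCNDA implementation.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN with fixed parameters (assumption: the paper adapts them).
    Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)          # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1                 # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster        # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels

def mean(ps):
    """Arithmetic-mean centroid (assumption: stands in for the paper's formula)."""
    return tuple(sum(c) / len(ps) for c in zip(*ps))

def merge_by_centroid(points, labels, k):
    """Merge the non-noise clusters pairwise by closest centroids until k remain."""
    groups = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            groups.setdefault(lab, []).append(p)
    clusters = list(groups.values())
    while len(clusters) > max(k, 1):
        a, b = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: math.dist(mean(clusters[ij[0]]), mean(clusters[ij[1]])),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]                    # b > a, so index a is unaffected
    return clusters

# Two tight blobs plus one far-away outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
clusters = merge_by_centroid(points, labels, k=2)
```

On this toy data the outlier at (50, 50) is labeled noise in the first stage and never reaches the merging stage, which is the behavior the abstract attributes to the DBSCAN pre-clustering step. Because merging compares one centroid per cluster, no shrink factor appears anywhere in the sketch.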
Authors
JIANG Hua (蒋华)
JI Feng (季丰)
WANG Xin (王鑫)
WANG Hui-jiao (王慧娇)
School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541000, China
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2018, No. 11, pp. 3425-3430, 3485 (7 pages)
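Funding
2016 Guangxi Universities Young and Middle-aged Teachers' Basic Ability Improvement Fund Project (ky2016YB150)
Guilin University of Electronic Technology Graduate Education Innovation Program Fund Project (2017YJCX48)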