Abstract
The density peaks clustering algorithm has two known weaknesses: the cutoff distance used to compute each sample's local density must be chosen subjectively, and its sample allocation strategy lets an assignment error propagate to later samples. To address both, this paper proposes an improved density peaks clustering algorithm that combines an adaptive cutoff distance with a sample allocation step based on a constructed manifold distance. First, the algorithm uses each sample's K nearest neighbors to select a cutoff distance adaptively for every point: a large cutoff distance at points of high sample density, so that cluster centers are selected accurately, and a small cutoff distance at points of low sample density, so that outliers can be identified. Second, for the remaining samples, a manifold distance is constructed from the connection paths between samples and used to optimize the allocation strategy. Finally, clustering experiments on artificial data sets compare the improved algorithm with the traditional density peaks clustering algorithm to verify its accuracy in cluster center selection and sample allocation.
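The abstract gives no formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a local density equal to the inverse of the mean distance to the K nearest neighbors (a common way to avoid a single global cutoff distance) and uses the minimax path distance as a stand-in for the manifold distance. All function names and parameters (`knn_density`, `minimax_path_distance`, `k`, `n_clusters`) are hypothetical.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def knn_density(D, k):
    """Assumed density: inverse mean distance to the k nearest neighbors."""
    knn = np.sort(D, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    return 1.0 / (knn.mean(axis=1) + 1e-12)

def dpc_delta(D, rho):
    """Distance from each point to its nearest point of higher density."""
    order = np.argsort(-rho, kind="stable")  # indices by decreasing density
    delta = np.zeros(len(rho))
    delta[order[0]] = D[order[0]].max()  # densest point: largest distance
    for rank in range(1, len(order)):
        i = order[rank]
        delta[i] = D[i, order[:rank]].min()
    return delta

def minimax_path_distance(D):
    """Stand-in manifold distance: smallest achievable 'largest hop' over
    all paths between two points (Floyd-Warshall style update)."""
    M = D.copy()
    for k in range(len(M)):
        M = np.minimum(M, np.maximum(M[:, k, None], M[None, k, :]))
    return M

def cluster(X, k=5, n_clusters=2):
    """Pick centers by rho * delta, then assign every point to the center
    that is closest in path-based distance."""
    D = pairwise_distances(X)
    rho = knn_density(D, k)
    delta = dpc_delta(D, rho)
    centers = np.argsort(-(rho * delta))[:n_clusters]
    M = minimax_path_distance(D)
    labels = np.argmin(M[:, centers], axis=1)
    return labels, centers

# Two well-separated synthetic blobs (illustrative data only).
rng = np.random.default_rng(0)
blob = rng.normal(scale=0.3, size=(30, 2))
X = np.vstack([blob, blob + 10.0])
labels, centers = cluster(X, k=5, n_clusters=2)
```

Assigning points by path-based distance rather than by following the nearest higher-density neighbor is one way to keep a single wrong link from dragging a chain of points into the wrong cluster. The cubic-time path computation is acceptable only for small data sets such as this one.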
Authors
张志壮
高文华
石慧
董增寿
ZHANG Zhi-zhuang; GAO Wen-hua; SHI Hui; DONG Zeng-shou (School of Electronic and Information Engineering, Taiyuan University of Science and Technology, Taiyuan 030024, China)
Source
《太原科技大学学报》
2023, No. 2, pp. 91-96 (6 pages)
Journal of Taiyuan University of Science and Technology
Funding
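National Natural Science Foundation of China, Young Scientists Fund (61703297)
Shanxi Province Key Research and Development Program (201903D321012, 201903D121023)
Natural Science Foundation of Shanxi Province (201801D121166, 201901D111264)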
Keywords
density peak clustering
cluster center
adaptive cutoff distance
manifold distance