Density Peak Clustering Algorithm Based on Adaptive Cutoff Distance and Sample Allocation
Abstract The density peak clustering algorithm requires a subjectively chosen cutoff distance when computing each sample's local density, and its sample allocation strategy lets assignment errors propagate. To address both problems, an improved density peak clustering algorithm is proposed that adapts the cutoff distance and optimizes sample allocation with a constructed manifold distance. The algorithm first uses each sample's K nearest neighbors to select the cutoff distance for that point adaptively: at points where the sample density is high, a large cutoff distance is used so that cluster centers are selected accurately, and at points where the sample density is low, a small cutoff distance is used so that outliers can be identified. Next, for the remaining samples, a manifold distance is constructed from the connection paths between samples to optimize the allocation strategy. Finally, clustering experiments on artificial data sets compare the method with the traditional density peak clustering algorithm to verify the accuracy of the improved algorithm in cluster center selection and sample allocation.
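For orientation, the baseline that the abstract sets out to improve can be sketched as follows. This is a minimal sketch of the traditional density peak clustering procedure (Rodriguez and Laio, 2014), not the improved algorithm of this paper. The function name `density_peak_clustering`, the Gaussian density kernel, and the choice of the `n_clusters` points with the largest density-times-distance product as centers are assumptions made for illustration. The single global `d_c` argument is the subjectively chosen cutoff distance, and the final loop is the chained allocation step in which one wrong assignment is inherited by every lower-density point that follows it.

```python
import numpy as np

def density_peak_clustering(X, d_c, n_clusters):
    """Baseline DPC with one fixed, user-chosen cutoff distance d_c (illustrative)."""
    n = X.shape[0]
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Local density with a Gaussian kernel; subtract the self term exp(0) = 1.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    # Sample indices sorted by decreasing density.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)            # distance to the nearest sample of higher density
    parent = np.full(n, -1)        # index of that nearest higher-density sample
    for rank, i in enumerate(order):
        if rank == 0:
            # The global density peak has no denser sample; use its largest distance.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j
    # Cluster centers: the samples with the largest rho * delta.
    # Ties are broken in density order, so the global peak is always a center.
    gamma = rho * delta
    centers = order[np.argsort(-gamma[order], kind="stable")[:n_clusters]]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Allocation: in decreasing density order, each remaining sample copies the
    # label of its nearest higher-density sample, so errors propagate downstream.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

Calling `density_peak_clustering(X, d_c=1.0, n_clusters=2)` on an `(n, 2)` array returns one label per sample and the indices of the chosen centers. Changing `d_c` changes `rho`, and with it which samples look like centers, which is the sensitivity that a per-point adaptive cutoff distance is meant to remove.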
Authors ZHANG Zhi-zhuang; GAO Wen-hua; SHI Hui; DONG Zeng-shou (School of Electronic and Information Engineering, Taiyuan University of Science and Technology, Taiyuan 030024, China)
Source Journal of Taiyuan University of Science and Technology, 2023, No. 2, pp. 91-96 (6 pages)
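Funding National Natural Science Foundation of China, Young Scientists Fund (61703297); Shanxi Province Key Research and Development Program (201903D321012, 201903D121023); Shanxi Province Natural Science Foundation (201801D121166, 201901D111264).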
Keywords density peak clustering; cluster center; adaptive cutoff distance; manifold distance