
Clustering by Pruning Paths Based on Shortest Paths from Density Peaks (cited by: 6)
Abstract Clustering groups empirical data by similarity or proximity, based on data labels or attributes. Density-based clustering algorithms mainly address two problems: how to determine the cluster centers and how to assign the remaining points. Building on a trainable clustering algorithm based on shortest paths to density peaks, this paper determines the cluster centers from the density peaks and proposes an improvement that applies a cutoff threshold to prune the path graph. The remaining points are then assigned globally with the shortest path method. Experimental results show that the improved algorithm runs more efficiently while maintaining clustering accuracy.
Authors HU Enxiang (胡恩祥); WANG Chunyu (汪春雨); PAN Meiqin (潘美芹) (School of Business and Management, Shanghai International Studies University, Shanghai 201600, China; School of Computer Science and Technology, East China Normal University, Shanghai 200062, China)
Source Journal of Applied Sciences (《应用科学学报》), 2020, No. 5, pp. 792-802 (11 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding Supported by the Shanghai International Studies University Planning Project Fund (No. 2019114009).
Keywords clustering; density peak; shortest path method; path pruning
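
The pipeline summarized in the abstract (density peaks as cluster centers, a cutoff threshold that prunes the path graph, then shortest-path assignment of the remaining points) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives neither the exact pruning rule nor the trainable component, so the sketch assumes a cutoff-count density, the common rho * delta score for picking peaks, and pruning by dropping every edge longer than a threshold. The names `cluster_by_pruned_paths`, `d_c`, `prune_cutoff`, and `n_clusters` are hypothetical.

```python
import heapq
import math


def cluster_by_pruned_paths(points, n_clusters, d_c, prune_cutoff):
    """Illustrative sketch only; parameter names and rules are assumptions."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density (assumed): number of other points closer than d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # delta: distance to the nearest point of higher density.
    # Density ties are broken by index; the top point gets its largest distance.
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n)
                  if (rho[j], -j) > (rho[i], -i)]
        delta.append(min(higher) if higher else max(dist[i]))

    # Density peaks (assumed rule): the n_clusters largest rho * delta scores.
    centers = sorted(range(n), key=lambda i: rho[i] * delta[i],
                     reverse=True)[:n_clusters]

    # Pruned path graph (assumed rule): keep edges no longer than prune_cutoff.
    neighbors = [[(j, dist[i][j]) for j in range(n)
                  if j != i and dist[i][j] <= prune_cutoff]
                 for i in range(n)]

    # Multi-source Dijkstra: each point takes the label of the center
    # with the shortest path to it in the pruned graph.
    labels = [-1] * n
    best = [math.inf] * n
    heap = []
    for label, c in enumerate(centers):
        best[c] = 0.0
        labels[c] = label
        heapq.heappush(heap, (0.0, c))
    while heap:
        d, u = heapq.heappop(heap)
        if d > best[u]:
            continue  # stale queue entry
        for v, w in neighbors[u]:
            nd = d + w
            if nd < best[v]:
                best[v] = nd
                labels[v] = labels[u]
                heapq.heappush(heap, (nd, v))

    # Assumption: points cut off by pruning fall back to the nearest center
    # by direct distance.
    for i in range(n):
        if labels[i] == -1:
            labels[i] = min(range(len(centers)),
                            key=lambda k: dist[i][centers[k]])
    return labels, centers
```

Pruning reduces the number of edges the shortest-path search has to relax, which is one plausible source of the efficiency gain the abstract reports. The fallback for disconnected points is a design choice of this sketch, not something stated in the source.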
