期刊文献+

自适应谱聚类算法并行实现及优化

Parallel Implementation and Optimization of Self-Tuning Spectral Clustering Algorithm
原文传递
导出
摘要 谱聚类算法是基于谱图分割理论的聚类方法,其对高维、非凸数据分布问题有很好的聚类效果。但对大规模数据问题的聚类,该方法存在着计算时间和存储空间等方面的瓶颈。本文给出了一个自适应的谱聚类并行算法,通过局部计算和异步循环通信并行方法,最大限度减少了并行谱聚类中数据通信次数,并通过计算与通信重叠策略,进一步降低了并行算法的通信开销。在并行算法实现中,将自主开发的最优预条件共轭梯度法并行求解器PLOBPCG用于谱聚类的特征降维。在中科院的"元"超级计算机上,通过对两类大规模数据聚类的测试表明,在2048核上的加速比接近线性加速,并行效率达到96%以上。 Spectral clustering algorithm is a clustering method based on spectral segmentation theory,which has a good clustering effect on high-dimensional and non-convex data distribution problems.However,there are some bottlenecks in the clustering of large scale data,such as computation time and storage space.In this paper,a parallel self-tuning spectral clustering algorithm is proposed.By means of local calculation and asynchronous loop communication,the number of data communication in parallel spectral clustering is minimized.In order to further reduce the communication overhead,a method of computing and communication overlap is used.In the parallel algorithm implementation,the self-developed locally optimal block preconditioned conjugate gradient method parallel solver PLOBPCG is applied to the feature reduction of spectral clustering.On the supercomputer"Era"of Chinese Academy of Sciences,the clustering test of two types of large-scale data shows that the speedup ratio of the 2048 core is nearly linear,and the parallel efficiency is above 96%.
作者 苏琳 赵永华 李瑞琳 Su Lin Zhao Yonghua Li Ruilin(University of Chinese Academy of Sciences, Beijing 100049, China Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China)
出处 《科研信息化技术与应用》 2016年第6期44-53,共10页 E-science Technology & Application
基金 数学工程与先进计算国家重点实验室开放基金(2014A03)
关键词 谱聚类 并行算法 自适应 PLOBPCG 循环通信 spectral clustering parallel algorithm self-tuning PLOBPCG loop communication
  • 相关文献

参考文献6

二级参考文献81

  • 1唐伟,周志华.基于Bagging的选择性聚类集成[J].软件学报,2005,16(4):496-502. 被引量:95
  • 2FiEDLER M. Algebraic connectivity of graphs[J]. Czechoslovak Mathematical Journal, 1973, 23(98) :298-305.
  • 3LUXBURG von U. A tutorial on spectral clustering[J]. Statistics and Computing, 2007, 17(4) :395-416.
  • 4NG A, JORDAN M, WEISS Y. On spectral clustering: analysis and an algorithm[ C ]//Advances in Neural Information Processing Systems (NIPS). Cambridge, MA: MIT Press, 2002.
  • 5SHI J, MALIK J. Normalized cuts and image segmentation[ J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(8):888-905.
  • 6KANNAN R, VEMPALA S, VETFA A. On clusterings-good, bad, and spectral[C]//Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science. [ S. l. ]: IEEE Press, 2000.
  • 7HUANG L, YAN D, JORDAN M. Spectral clustering with perturbed data[ C]// Advances in Neural Information Processing Systems (NIPS). Cambridge, MA: MIT Press, 2008: 705- 712.
  • 8CHI Y, SONG X, ZHOU D. Evolutionary spectral clustering by incorporating temporal smoothness[ C]// Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Jose, California: ACM Press, 2007: 153-162.
  • 9ZELNIK MANOR L, PERONA P. Self-tuning spectral clustering[ C]// Advances in Neural Information Processing Systems (NIPS). Cambridge, MA: MIT Press, 2004.
  • 10DHILLON I, GUAN Y, KULIS B. Kernel k-means, spectral clustering and normalized cuts[C]//Proceedings of the 10th International Conference on mining. Seattle, WA, USA: Knowledge Discovery and Data ACM Press, 2004.

共引文献86

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部