
Improved semi-supervised spectral clustering algorithm for large-scale data based on constrained optimal propagation

Cited by: 3
Abstract: Traditional spectral clustering suffers from high computational complexity, sensitivity to noise, and skewed cluster shapes. To address these problems, and taking into account the characteristics and requirements of current large-scale data clustering, this paper establishes an improved semi-supervised spectral clustering model for large-scale data based on constrained optimal propagation. The model first uses prior pairwise point constraints to build a micro similarity matrix. It then applies the Gabow algorithm to extract the strongly connected components of the connected graph corresponding to that micro similarity matrix, and proposes a new constrained optimal propagation algorithm over these components to obtain the pairwise similarities of the whole data set. Finally, it obtains the clustering result for the large-scale data through singular value decomposition followed by an accelerated K-means algorithm. Experiments on several standard benchmark data sets show that, compared with earlier work in this field, the proposed model achieves higher clustering accuracy and lower computational complexity, and is better suited to large-scale data clustering applications.
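The abstract names the Gabow algorithm for extracting strongly connected components but gives no implementation details. The following is a minimal sketch of Gabow's path-based algorithm in Python, not the authors' code. The graph representation (a dict mapping each integer node to a list of successors) and the function name `gabow_scc` are assumptions made for illustration; how the paper derives edges from the micro similarity matrix is not stated in this record.

```python
from typing import Dict, List


def gabow_scc(graph: Dict[int, List[int]]) -> List[List[int]]:
    """Return the strongly connected components of a directed graph.

    `graph` maps each node to the list of its successors (assumed layout).
    Uses recursion, so it is only suitable for small illustrative graphs.
    """
    preorder: Dict[int, int] = {}   # node -> order in which it was first visited
    assigned = set()                # nodes already placed in a component
    stack_s: List[int] = []         # visited nodes not yet assigned to a component
    stack_p: List[int] = []         # candidate component roots on the current path
    components: List[List[int]] = []

    def visit(v: int) -> None:
        preorder[v] = len(preorder)
        stack_s.append(v)
        stack_p.append(v)
        for w in graph.get(v, []):
            if w not in preorder:
                visit(w)
            elif w not in assigned:
                # w is still on stack_s, so v and w share a component:
                # discard candidate roots that were visited after w.
                while preorder[stack_p[-1]] > preorder[w]:
                    stack_p.pop()
        if stack_p[-1] == v:
            # v is the root of a component: pop its members off stack_s.
            stack_p.pop()
            component = []
            while True:
                w = stack_s.pop()
                assigned.add(w)
                component.append(w)
                if w == v:
                    break
            components.append(sorted(component))

    for node in graph:
        if node not in preorder:
            visit(node)
    return components


# Hypothetical constraint graph: two cycles joined by a one-way edge,
# plus one isolated node.
example_graph = {0: [1], 1: [2], 2: [0, 3], 3: [4], 4: [3], 5: []}
example_components = gabow_scc(example_graph)
```

In this sketch each returned component would then be handled separately by the propagation step the abstract describes; that step is not reproduced here because the record does not specify it.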
Authors: Xu Dayu; Yu Yingjun; Feng Hailin; Zhang Xuyao (School of Information Engineering, Zhejiang A&F University, Hangzhou 311300, China; Sunyard System Engineering Co., Ltd., Hangzhou 310053, China)
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2018, No. 5, pp. 1325-1330 (6 pages)
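Funding: National Natural Science Foundation of China (61272313); Zhejiang Provincial Natural Science Foundation (LQ17G010003); Zhejiang Provincial Major Science and Technology Special Project (2015C03008)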
Keywords: spectral clustering; large-scale data; pairwise constraint; affinity propagation; singular value decomposition