Distributed Low-rank Tensor Subspace Clustering Algorithm

Abstract: Existing subspace clustering algorithms based on low-rank representation (LRR) cannot handle large-scale data effectively and have limited clustering accuracy, while the distributed low-rank subspace clustering algorithm (DFC-LRR) cannot process high-dimensional data directly. To address these problems, this paper proposes a subspace clustering algorithm based on tensors and distributed computing. The algorithm first treats high-dimensional data as a tensor and introduces tensor multiplication into the self-representation of the data, thereby extending LRR subspace clustering to high-dimensional data. It then obtains the low-rank coefficient tensor through distributed parallel computing and sparsifies each lateral slice of the coefficient tensor to obtain a sparse similarity matrix. Comparative experiments against DFC-LRR on the public datasets Extended Yale B, COIL20 and UCSD show that the proposed algorithm improves clustering accuracy and that distributed computing markedly reduces its running time.
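The tensor self-representation and the slice-wise sparsification described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: the "tensor multiplication" is taken to be the FFT-based t-product, samples are stored as lateral slices `X[:, j, :]`, and the sparsification rule (keep the largest-magnitude entries of each lateral slice) is purely illustrative. The names `t_product`, `identity_tensor` and `sparsify_lateral_slices` are hypothetical and not taken from the paper.

```python
import numpy as np


def t_product(A, B):
    """t-product of A (n1 x n2 x n3) with B (n2 x n4 x n3).

    Assumption: the abstract's "tensor multiplication" is the t-product,
    computed by multiplying frontal slices in the Fourier domain.
    """
    if A.shape[1] != B.shape[0] or A.shape[2] != B.shape[2]:
        raise ValueError("incompatible tensor shapes for the t-product")
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    # Matrix product of matching frontal slices, one per frequency index k.
    C_hat = np.einsum("ijk,jlk->ilk", A_hat, B_hat)
    # Real inputs give a real result up to rounding error.
    return np.real(np.fft.ifft(C_hat, axis=2))


def identity_tensor(n, n3):
    """Identity for the t-product: first frontal slice is I, the rest are zero."""
    I = np.zeros((n, n, n3))
    I[:, :, 0] = np.eye(n)
    return I


def sparsify_lateral_slices(Z, keep):
    """Hypothetical rule: keep the `keep` largest-magnitude entries per lateral slice."""
    out = np.zeros_like(Z)
    for j in range(Z.shape[1]):
        slice_j = Z[:, j, :]
        flat = np.abs(slice_j).ravel()
        top = np.argsort(flat)[-keep:]
        mask = np.zeros(flat.shape, dtype=bool)
        mask[top] = True
        out[:, j, :] = np.where(mask.reshape(slice_j.shape), slice_j, 0.0)
    return out


rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5, 3))      # 5 samples stored as lateral slices
Z_trivial = identity_tensor(5, 3)       # trivial self-representation X = X * Z
X_rec = t_product(X, Z_trivial)
Z_sparse = sparsify_lateral_slices(rng.standard_normal((5, 5, 3)), keep=2)
```

In this layout the coefficient tensor `Z` has shape N x N x n3 for N samples, so the identity tensor is the trivial solution of X = X * Z. A low-rank solver, which the sketch leaves out, would replace it with a structured `Z` before the sparsification step.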

Authors: LIU Xiaolan; PAN Gan; YI Miao; LI Zhipeng

Affiliations: School of Mathematics, South China University of Technology, Guangzhou 510640, Guangdong, China; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, Jiangsu, China; College of Physical Science and Technology, Yichun University, Yichun 336000, Jiangxi, China; School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China

Source: Journal of South China University of Technology (Natural Science Edition), 2019, No. 8, pp. 77-83, 95 (8 pages in total). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

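Funding: National Natural Science Foundation of China (61502175, 61273295); Natural Science Foundation of Guangdong Province (2016A030313545); Guangzhou Science and Technology Program (201607010069)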

Keywords: low-rank representation; subspace clustering; distributed computing; tensor

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
