Abstract
Existing incomplete multi-view clustering algorithms based on nonnegative matrix factorization (NMF) do not extract local features accurately. To address this problem, a chunk-by-chunk incomplete multi-view clustering algorithm based on orthogonal constraints (CIMVCO) is proposed. NMF is used to obtain the latent feature matrix of all views, and orthogonal constraints are added to obtain better local features. Samples missing from a view are given smaller weights to reduce the impact of missing data. To handle large-scale data, CIMVCO processes the data chunk by chunk, reducing memory demand and processing time. Experimental results on the Reuters and Digit datasets demonstrate the effectiveness of CIMVCO.
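The ingredients named in the abstract (a latent matrix for all views, an orthogonality term, small weights for missing samples, chunk-wise processing) can be illustrated with a rough sketch. This is not the authors' CIMVCO algorithm: the objective, the multiplicative updates, the soft penalty `lam` on the basis matrices, and the names `fit_chunk` and `cluster_in_chunks` are all assumptions made for illustration, since the abstract does not state them.

```python
import numpy as np

def fit_chunk(views, weights, bases, n_iter=50, lam=0.1, eps=1e-9, seed=0):
    """Weighted multi-view NMF on one chunk (illustrative assumption).

    views:   list of (n, d_v) nonnegative arrays; missing rows may be zero-filled
    weights: list of (n,) arrays; a small value marks a sample missing in that view
    bases:   list of (d_v, k) nonnegative basis matrices carried across chunks
    """
    rng = np.random.default_rng(seed)
    n, k = views[0].shape[0], bases[0].shape[1]
    U = rng.random((n, k))  # latent matrix shared by all views
    bases = [V.copy() for V in bases]
    for _ in range(n_iter):
        # Multiplicative update of the shared latent matrix.
        num = sum((w[:, None] * X) @ V for X, w, V in zip(views, weights, bases))
        den = sum((w[:, None] * U) @ (V.T @ V) for w, V in zip(weights, bases)) + eps
        U *= num / den
        # Multiplicative update of each basis with a soft penalty
        # (lam / 2) * ||V^T V - I||_F^2 standing in for the orthogonal constraint.
        for i, (X, w) in enumerate(zip(views, weights)):
            V = bases[i]
            WU = w[:, None] * U
            num_v = X.T @ WU + lam * V
            den_v = V @ (U.T @ WU) + lam * V @ (V.T @ V) + eps
            bases[i] = V * num_v / den_v
    return U, bases

def cluster_in_chunks(views, weights, k, chunk_size, seed=0):
    """Process samples chunk by chunk so only one chunk is factorized at a time."""
    rng = np.random.default_rng(seed)
    bases = [rng.random((X.shape[1], k)) for X in views]
    latent = []
    for start in range(0, views[0].shape[0], chunk_size):
        sl = slice(start, start + chunk_size)
        U, bases = fit_chunk([X[sl] for X in views],
                             [w[sl] for w in weights], bases, seed=seed)
        latent.append(U)
    U_all = np.vstack(latent)
    # Simple stand-in for the final clustering step: largest latent component.
    return U_all.argmax(axis=1), U_all, bases
```

In this sketch the basis matrices are the only state carried from one chunk to the next, which is what keeps memory use bounded by the chunk size. Latent rows from earlier chunks are not revisited after the bases change, a simplification that a real implementation would likely need to address.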
Authors
姜健伟
殷俊
JIANG Jianwei; YIN Jun (College of Information Engineering, Shanghai Maritime University, Shanghai 201306)
Source
《模式识别与人工智能》
EI
CSCD
PKU Core Journals
2020, No. 1, pp. 41-49 (9 pages)
Pattern Recognition and Artificial Intelligence
Funding
Supported by the National Natural Science Foundation of China (No. 61603243)
and the China Postdoctoral Science Foundation (No. 2017M611503)
Keywords
Chunk-by-Chunk
Multi-view Clustering
Nonnegative Matrix Factorization (NMF)
Orthogonal Constraints