Abstract
The least squares regression (LSR) subspace clustering algorithm is sensitive to noise in the data, constrains the structural information of the data insufficiently, and does not consider nonlinear relationships in the data. To address these problems, an improved algorithm based on the log function is proposed. The L_{2,log} norm replaces the Frobenius norm as the constraint on the residual term, improving the robustness of the algorithm. The logdet norm replaces the Frobenius norm as the constraint on the representation matrix, strengthening its low-rank property. The kernel method is used to process the data, enhancing the algorithm's ability to capture nonlinear relationships and thereby improving clustering accuracy. Comparative experiments against several classical clustering algorithms were carried out on three types of datasets: faces, handwritten digits, and objects. The results show that the proposed algorithm outperforms the comparison algorithms on three clustering evaluation metrics (accuracy, normalized mutual information, and purity) and achieves good clustering performance.
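For orientation, the baseline that the abstract sets out to improve can be sketched in a few lines. The block below is a minimal illustration, not the authors' algorithm: it shows the standard closed-form solution of kernelized least squares regression with Frobenius-norm terms, Z = (K + λI)^{-1} K, the symmetric affinity usually passed on to spectral clustering, and the purity metric. The function names, the choice of a Gaussian kernel, the parameters `gamma` and `lam`, and the form log det(I + ZᵀZ) used for the logdet surrogate are all assumptions made for illustration; the abstract does not give the exact definitions of the L_{2,log} or logdet terms, so they are not reproduced here.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian kernel matrix for samples stored as columns of X (d x n).
    The kernel choice and gamma are illustrative assumptions."""
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_lsr(K, lam=0.1):
    """Baseline kernel LSR: minimizer of
    ||phi(X) - phi(X) Z||_F^2 + lam * ||Z||_F^2, i.e. Z = (K + lam I)^{-1} K."""
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), K)

def affinity(Z):
    """Symmetric non-negative affinity typically fed to spectral clustering."""
    return 0.5 * (np.abs(Z) + np.abs(Z.T))

def logdet_surrogate(Z):
    """One common log-determinant rank surrogate (assumed form):
    log det(I + Z^T Z) = sum_i log(1 + sigma_i^2)."""
    s = np.linalg.svd(Z, compute_uv=False)
    return float(np.sum(np.log1p(s ** 2)))

def purity(y_true, y_pred):
    """Fraction of samples that belong to the majority true class of their cluster."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    total = 0
    for c in np.unique(y_pred):
        _, counts = np.unique(y_true[y_pred == c], return_counts=True)
        total += counts.max()
    return total / y_true.size
```

In such a pipeline the affinity matrix would then be handed to any spectral clustering routine that accepts a precomputed affinity; the proposed algorithm differs in the norms applied to the residual and to Z, which generally requires an iterative solver rather than this closed form.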
Authors
ZHANG Xin (张鑫)
FEI Keke (费可可)
College of Computer Science & Technology, Qingdao University, Qingdao 266071, Shandong, China
Source
Journal of Shandong University (Engineering Science) (《山东大学学报(工学版)》)
CSCD
PKU Core (北大核心)
2023, No. 6, pp. 26-34, 46 (10 pages in total)
Keywords
subspace clustering
spectral clustering
least squares regression
kernel method
norm