摘要
可分离非负矩阵分解(NMF)是通过抽取数据集中的部分样本或关键主题来表示整个数据集的一种特殊NMF方法。广义可分离非负矩阵分解(GSNMF)算法是由可分离NMF扩展的算法,可以同时得到数据集中的关键样本和关键主题两类特征,使分解结果更具有可解释性,但在处理某些数据集时由于选择方法存在的缺陷,GSNMF算法只能单独选择行或列的特征,从而失去可解释性的优点。为此,引入正交约束来修正GSNMF算法的选取结果,提出一种基于正交约束的广义可分离非负矩阵分解(OGSNMF)算法,利用非负特性及正交约束的特点,限制迭代过程中关于行和列的迭代矩阵,确保得到行和列的特征,并获取更加精确的分解结果。在此基础上,引入相对近似误差作为实验指标,结合分解结果的秩在行与列上的分配作为实验评判标准。实验结果表明,与原有算法相比,OGSNMF算法在处理数据集时,相对近似误差提高了1~3个百分点,说明在分解过程中损失的信息更少,确保能够获取到行和列的特征,得到更具有可解释性的分解结果。
Separable Nonnegative Matrix Factorization(NMF)is a special NMF method used to represent an entire dataset by extracting partial samples or key topics from the dataset.Generalized Separable Nonnegative Matrix Factorization(GSNMF)is an extended algorithm of separable NMF that can simultaneously obtain two types of features of key samples and key topics in the dataset to enable the decomposition results to more interpretable.However,it is shown that when GSNMF processes certain datasets,because of its defective selection algorithm,the decomposition process select only the characteristics of the row or column,thus losing the advantage of interpretability.The orthogonal constraints are introduced to correct the selection results of GSNMF and a algorithm of GSNMF with Orthogonal constraints(OGSNMF)is derived.OGSNMF uses the characteristics of nonnegative and orthogonal constraints to limit the iteration matrix of rows and columns in the iterative process.This ensures that the features of rows and columns are obtained simultaneously,thus yielding more accurate and thorough decomposition results.To verify the effectiveness of the algorithm,the relative approximation error is introduced as the experimental index,and the distribution of rank on rows and columns of the decomposition results is used as the experimental evaluation standard.A comparison of the experimental results shows that the OGSNMF improves the relative approximation error by 1 to 3 percentage points over that of the original algorithm,indicating that the loss of information under the decomposition process is less.The algorithm also ensures that the characteristics of rows and columns can be obtained,making the decomposition results more interpretable.
作者
陈君航
杨祖元
刘名扬
李陵江
CHEN Junhang;YANG Zuyuan;LIU Mingyang;LI Lingjiang(Guangdong Key Laboratory of IoT Information Technology,School of Automation,Guangdong University of Technology,Guangzhou 510006,China;Guangdong-Hong Kong-Macao Joint Laboratory for Smart Discrete Manufacturing,Guangzhou 510006,China;Key Laboratory of Intelligence Detection and Manufacturing Internet of Things,Ministry of Education,Guangzhou 510006,China.)
出处
《计算机工程》
CAS
CSCD
北大核心
2023年第8期46-53,共8页
Computer Engineering
基金
广东省重点领域研发计划(2019B010154002)。
关键词
降维
非负矩阵分解
广义可分离非负矩阵分解
正交约束
数据表示
dimension reduction
Nonnegative Matrix Factorization(NMF)
Generalized Separable Nonnegative Matrix Factorization(GSNMF)
orthogonal constraint
data representation