Current methods for predicting missing values in datasets often rely on simplistic approaches, such as taking the median value of an attribute, which limits their applicability. Real-world observations can be diverse. Stock prices, for example, range from prices just after an IPO to values before a company's collapse, and some data points are missing because trading in the stock was suspended. In this paper, we propose a novel approach using Nonlinear Inductive Matrix Completion (NIMC) and Deep Inductive Matrix Completion (DIMC) to predict associations, and conduct experiments on financial data linking dates and stocks. Our method leverages various types of stock observations to capture latent factors that explain the observed date-stock associations. Notably, our approach is nonlinear, making it suitable for datasets with nonlinear structure, such as the Russell 3000. Unlike traditional methods that may suffer from information loss, NIMC and DIMC retain nearly complete information, especially with high-dimensional parameters. Our comparison covers three methods: the state-of-the-art linear Inductive Matrix Completion (IMC), NIMC, and DIMC. Our findings show that nonlinear matrix completion is particularly effective for nonlinearly structured data, as exemplified by the Russell 3000. Additionally, we evaluate the information loss of the three methods across different dimensionalities.
Most existing network representation learning algorithms learn from network structure alone. However, network structure is only one view of a network and cannot fully reflect all of its characteristics. In fact, network vertices usually carry rich text information, which can be used to learn text-enhanced network representations. Meanwhile, the Matrix-Forest Index (MFI) has shown high effectiveness and stability in link prediction tasks compared with other link prediction algorithms. Neither MFI nor Inductive Matrix Completion (IMC) has been well integrated into the algorithmic frameworks of typical representation learning methods. Therefore, we propose a novel semi-supervised algorithm, tri-party deep network representation learning using inductive matrix completion (TDNR). Based on the inductive matrix completion algorithm, TDNR incorporates text features, the link certainty degrees of existing edges, and the future link probabilities of non-existing edges into network representations. The experimental results demonstrate that TDNR outperforms other baselines on three real-world datasets. Visualizations of TDNR show that the proposed algorithm is more discriminative than other unsupervised approaches.
The problem of high-precision indoor positioning in the 5G era has attracted increasing attention. A fingerprint location method based on matrix completion (MC-FPL) is proposed for 5G ultradense networks to overcome the high costs of traditional fingerprint database construction and matching algorithms. First, a partial fingerprint database is constructed, and the accelerated proximal gradient algorithm is used to fill it in and obtain a full fingerprint database. Second, a fingerprint database division method based on the strongest received signal strength indicator is proposed, which divides the original fingerprint database into several sub-fingerprint databases. Finally, a classification weighted K-nearest neighbor fingerprint matching algorithm is proposed. The estimated coordinates of the point to be located are obtained by fingerprint matching in a sub-fingerprint database. The simulation results show that the MC-FPL algorithm reduces the complexity of database construction and fingerprint matching and achieves higher positioning accuracy than the traditional fingerprint algorithm.
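The matching step described above can be illustrated with a minimal weighted K-nearest-neighbor sketch. It is not the paper's classification weighted variant: the inverse-distance weighting, the function name `wknn_locate`, and the toy fingerprint database are all assumptions made for illustration.

```python
import math

def wknn_locate(database, query, k=3, eps=1e-9):
    """Estimate (x, y) for a query RSSI vector by inverse-distance weighted KNN.

    database: list of (rssi_vector, (x, y)) reference fingerprints (hypothetical layout).
    """
    # Euclidean distance in signal space between the query and each reference point
    scored = []
    for rssi, pos in database:
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(rssi, query)))
        scored.append((d, pos))
    scored.sort(key=lambda item: item[0])
    nearest = scored[:k]

    # Closer fingerprints get larger weights; eps avoids division by zero
    weights = [1.0 / (d + eps) for d, _ in nearest]
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, nearest)) / total
    return x, y

# Toy database: two RSSI readings per reference point, positions in metres
db = [
    ([-40.0, -70.0], (0.0, 0.0)),
    ([-70.0, -40.0], (10.0, 0.0)),
    ([-55.0, -55.0], (5.0, 0.0)),
    ([-90.0, -90.0], (5.0, 20.0)),
]
estimate = wknn_locate(db, [-55.0, -55.0], k=3)
```

In this toy case the query coincides with a stored fingerprint, so the estimate falls essentially on that reference point. In the method described above, the search would presumably be restricted to the sub-fingerprint database selected by the strongest received signal.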
Direction-of-arrival (DOA) estimation is usually performed under the assumption of uniform noise. In many applications, however, the noise across the array may be nonuniform. In this situation, the performance of DOA estimators may deteriorate greatly if the nonuniformity of the noise is ignored. To tackle this problem, we consider DOA estimation in the presence of nonuniform noise by leveraging a singular value thresholding (SVT) based matrix completion method. Unlike the traditional SVT method, which applies a fixed threshold, the proposed method obtains a more suitable threshold from a careful estimate of the signal-to-noise ratio (SNR) level. Specifically, we first employ an SVT-based matrix completion method to estimate the noise-free covariance matrix. The signal and noise subspaces are then obtained from the eigendecomposition of the noise-free covariance matrix. Finally, traditional subspace-based DOA estimation approaches can be applied directly to determine the DOAs. Numerical simulations demonstrate the effectiveness of the proposed method.
Matrix completion is an extension of compressed sensing. In compressed sensing, underdetermined equations are solved using a sparsity prior on the unknown signal. In matrix completion, the underdetermined equations are solved using a sparsity prior on the singular values of the unknown matrix, also called the low-rank prior. This paper first introduces the basic concepts of matrix completion, analyzes which matrices are suitable for matrix completion, and shows that such a matrix should satisfy two conditions: low rank and the incoherence property. The paper then reviews three reconstruction algorithms commonly used in matrix completion, namely singular value thresholding, singular value projection, and atomic decomposition for minimum rank approximation, and points out their shortcoming of requiring the rank of the original matrix to be known. The Projected Gradient Descent based on Soft Thresholding (STPGD) algorithm proposed in this paper predicts the rank of the unknown matrix using soft thresholding and iterates using projected gradient descent, so it can estimate the rank of the unknown matrix exactly with low computational complexity, as verified by numerical experiments. We also analyze the convergence and computational complexity of the STPGD algorithm, show that it is guaranteed to converge, and analyze the number of iterations needed to reach a given reconstruction error. Comparing the computational complexity of STPGD with that of other algorithms, we conclude that STPGD not only reduces the computational complexity but also improves the precision of the reconstructed solution.
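The idea of predicting rank by soft thresholding can be sketched on a list of singular values: each value is shrunk by a threshold and clipped at zero, and the number of survivors serves as a rank estimate. This is an illustrative sketch only. The abstract does not say how STPGD chooses its threshold, so `tau`, the function names, and the sample singular values are assumptions.

```python
def soft_threshold(values, tau):
    """Shrink each value toward zero by tau, clipping negatives to zero."""
    return [max(v - tau, 0.0) for v in values]

def estimate_rank(singular_values, tau):
    """Count the singular values that remain nonzero after soft thresholding."""
    return sum(1 for s in soft_threshold(singular_values, tau) if s > 0.0)

# Hypothetical spectrum: three dominant singular values plus two noise-level ones
sigma = [9.0, 4.5, 2.0, 0.3, 0.1]
shrunk = soft_threshold(sigma, tau=1.0)
rank = estimate_rank(sigma, tau=1.0)
```

With `tau=1.0` the two small values are zeroed out, so the estimated rank is 3. In a full solver the singular values would come from an SVD of the current iterate.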
In matrix completion, additional covariates often provide valuable information for completing the unobserved entries of a high-dimensional low-rank matrix A. In this paper, the authors consider the matrix recovery problem when there are multiple structural breaks in the coefficient matrix β under the column-space-decomposition model A = Xβ + B. A cumulative sum (CUSUM) statistic is constructed based on the penalized estimation of β. The CUSUM statistic is then incorporated into the Wild Binary Segmentation (WBS) algorithm to consistently estimate the locations of the breaks. Consequently, a nearly optimal recovery of A is achieved. The theoretical findings are further corroborated by numerical experiments and a real-data application.
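For readers unfamiliar with CUSUM-based break detection, the sketch below applies the standard CUSUM statistic for a single mean shift to a scalar sequence. This is a deliberate simplification: the paper builds its statistic from penalized estimates of the coefficient matrix β and embeds it in WBS, neither of which is reproduced here. The function name `cusum_break` and the toy series are assumptions.

```python
import math

def cusum_break(x):
    """Return (t_hat, statistic) maximizing the scaled mean-difference CUSUM.

    For each split t, compare the mean of x[:t] with the mean of x[t:],
    scaled by sqrt(t * (n - t) / n) so that all splits are comparable.
    """
    n = len(x)
    best_t, best_stat = None, -1.0
    for t in range(1, n):
        left_mean = sum(x[:t]) / t
        right_mean = sum(x[t:]) / (n - t)
        stat = math.sqrt(t * (n - t) / n) * abs(left_mean - right_mean)
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat

# Toy series with one break: six zeros followed by four threes
series = [0.0] * 6 + [3.0] * 4
t_hat, stat = cusum_break(series)
```

Here the statistic peaks at the split after the sixth observation, which is where the mean changes. WBS would evaluate the same kind of statistic on many random sub-intervals in order to locate multiple breaks.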
In this paper, we study the low-rank matrix completion problem with Poisson observations, where only some of the entries are available and the observations are corrupted by Poisson noise. We propose a novel model that combines the Kullback-Leibler (KL) divergence, derived from the maximum likelihood estimation of Poisson noise, with total variation (TV) and nuclear norm constraints. The nuclear norm and TV constraints exploit the approximate low-rankness and the piecewise smoothness of the underlying matrix, respectively. The advantage of using both constraints is that low-rankness and piecewise smoothness can be exploited simultaneously, and these regularizers apply to many kinds of real-world image data. An upper error bound for the estimator of the proposed model is established with high probability, and it is no larger than the bound obtained with only the TV or only the nuclear norm constraint. To the best of our knowledge, this is the first work to use both low-rank and TV constraints with theoretical error bounds for matrix completion under Poisson observations. Extensive numerical examples on both synthetic data and real-world images are reported to corroborate the superiority of the proposed approach.
This paper introduces an algorithm for the nonnegative matrix factorization-and-completion problem, which aims to find nonnegative low-rank matrices X and Y so that the product XY approximates a nonnegative data matrix M whose elements are partially known (to a certain accuracy). This problem aggregates two existing problems: (i) nonnegative matrix factorization, where all entries of M are given, and (ii) low-rank matrix completion, where nonnegativity is not required. By taking advantage of both nonnegativity and low-rankness, one can generally obtain better results than by using only one of the two properties. We propose to solve the non-convex constrained least-squares problem using an algorithm based on the classical alternating direction augmented Lagrangian method. Preliminary convergence properties of the algorithm and numerical simulation results are presented. Compared to a recent algorithm for nonnegative matrix factorization, the proposed algorithm produces factorizations of similar quality using only about half of the matrix entries. On tasks of recovering incomplete grayscale and hyperspectral images, the proposed algorithm yields better overall quality than two recent matrix-completion algorithms that do not exploit nonnegativity.
Survival analysis aims to predict the occurrence time of a particular event of interest, which is crucial for the prognosis analysis of diseases. Because of limited study periods and possible loss to follow-up, the observed data inevitably include some censored instances, which poses a unique challenge that distinguishes survival analysis from general regression problems. In addition, survival analysis suffers from other inherent challenges, such as high-dimensional data with small sample sizes. To address these challenges, we propose a novel multi-task regression learning model, the prior information guided transductive matrix completion (PigTMC) model, to predict the survival status of new instances. Specifically, we use the multi-label transductive matrix completion framework to leverage the censored instances together with the uncensored instances as training samples, and simultaneously employ a multi-task transductive feature selection scheme to alleviate the overfitting caused by high-dimensional, small-sample-size data. In addition, we use the prior temporal stability of the survival statuses at adjacent time intervals to guide the survival analysis. Furthermore, we design an optimization algorithm with guaranteed convergence to solve the proposed PigTMC model. Finally, extensive experiments on real microarray gene expression datasets demonstrate that the proposed model outperforms widely used competing methods.
We investigate the problem of robust matrix completion when a fraction of the observations is corrupted by sparse outlier noise. We propose an algorithmic framework based on the ADMM algorithm for a non-convex optimization problem whose objective function consists of an l1-norm data fidelity term and a rank constraint. To reduce the computational cost per iteration, two inexact schemes are developed to replace the most time-consuming step of the generic ADMM algorithm. The resulting algorithms remarkably outperform existing solvers for robust matrix completion with outlier noise. When the noise is severe and the underlying matrix is ill-conditioned, the proposed algorithms are faster and give more accurate solutions than state-of-the-art robust matrix completion approaches.
In this paper, we propose a decentralized algorithm to solve the low-rank matrix completion problem and analyze its privacy-preserving property. Suppose that we want to recover a low-rank matrix D = [D1, D2, ..., DL] from a subset of its entries. In a network composed of L agents, each agent i observes some entries of Di. We factorize the unknown matrix D as the product of a public matrix X, which is common to all agents, and a private matrix Y = [Y1, Y2, ..., YL], of which Yi is held by agent i only. Each agent i updates Yi and its local estimate of X, denoted by X(i), in an alternating manner. By exchanging information with neighbors, all the agents move toward a consensus on the estimates X(i). Once the consensus is (nearly) reached throughout the network, each agent i recovers Di = X(i)Yi, and thus D is recovered. In this process, communication through the network may disclose sensitive information about the data matrices Di to a malicious agent. We prove that in the proposed algorithm, D-LMaFit, if the network topology is well designed, the malicious agent is unable to reconstruct the sensitive information of others.
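The alternating update of the two factors can be sketched in a much simpler setting than the one described above: a single machine, a rank-1 factorization, and plain alternating least squares on the observed entries. The sketch omits the agents, the consensus step, and the privacy analysis, so it is not D-LMaFit. The function name `complete_rank1` and the toy data are assumptions.

```python
def complete_rank1(observed, n_rows, n_cols, iters=100):
    """Fit D[i][j] ~ x[i] * y[j] on the observed entries by alternating least squares.

    observed: dict mapping (row, col) -> value for the known entries only.
    """
    x = [1.0] * n_rows
    y = [1.0] * n_cols
    for _ in range(iters):
        # Fix y and solve a one-dimensional least-squares problem for each x[i]
        for i in range(n_rows):
            num = sum(v * y[j] for (r, j), v in observed.items() if r == i)
            den = sum(y[j] ** 2 for (r, j), v in observed.items() if r == i)
            if den > 0:
                x[i] = num / den
        # Fix x and solve for each y[j] in the same way
        for j in range(n_cols):
            num = sum(v * x[i] for (i, c), v in observed.items() if c == j)
            den = sum(x[i] ** 2 for (i, c), v in observed.items() if c == j)
            if den > 0:
                y[j] = num / den
    return x, y

# Toy rank-1 matrix [[1, 2], [2, 4], [3, 6]] with entry (1, 1) hidden
observed = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0, (2, 0): 3.0, (2, 1): 6.0}
x, y = complete_rank1(observed, n_rows=3, n_cols=2)
missing = x[1] * y[1]
```

On this toy example the hidden entry is recovered as approximately 4. In the decentralized setting each agent would presumably run updates of this kind on its own block Di while exchanging only its local estimate of the public factor with neighbors.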
The semidefinite matrix completion (SMC) problem is to recover a low-rank positive semidefinite matrix from a small subset of its entries. The problem is well known and NP-hard in general. We first show that, in some cases, the SMC problem and the S1/2 relaxation model share a unique solution. We then prove that the global optimal solutions of the S1/2 regularization model are fixed points of a symmetric matrix half thresholding operator. We give an iterative scheme for solving the S1/2 regularization model and present a convergence analysis of the iterative sequence. Using an optimal regularization parameter setting together with truncation techniques, we develop an HTE algorithm for the S1/2 regularization model, and numerical experiments confirm the efficiency and robustness of the proposed algorithm.
Linear programming models have been widely used in input-output analysis for analyzing the interdependence of industries in economics and in environmental science. In these applications, some entries of the coefficient matrix cannot be measured physically or are subject to sampling errors. However, the coefficient matrix is often low-rank. We characterize the robust counterpart of these types of linear programming problems with an uncertainty set described by the nuclear norm. Simulations for input-output analysis show that the new paradigm can be helpful.
Based on the idea of maximum determinant positive definite matrix completion, Yamashita (Math Prog 115(1):1–30, 2008) proposed a new sparse quasi-Newton update, called MCQN, for unconstrained optimization problems with sparse Hessian structures. In exchange for relaxing the secant equation, the MCQN update avoids solving difficult subproblems and overcomes the ill-conditioning of approximate Hessian matrices. However, local and superlinear convergence results were established only for the MCQN update with the DFP method. In this paper, we extend the convergence result to the MCQN update with the whole Broyden's convex family. Numerical results are also reported, which suggest some efficient ways of choosing the parameter in the MCQN update with Broyden's family.
In a matrix-completion problem the aim is to specify the missing entries of a matrix in order to produce a matrix with particular properties. In this paper we survey results concerning matrix-completion problems where we look for completions of various types for partial matrices supported on a given pattern. We see that the existence of completions of the required type often depends on the chordal properties of graphs associated with the pattern.
In this paper, a unified matrix recovery model is proposed for diverse corrupted matrices. Owing to the separable structure of the proposed model, the convex optimization problem can be solved efficiently with an inexact augmented Lagrange multiplier (IALM) method. Additionally, a random projection acceleration technique (IALM+RP) is adopted to improve the success rate. Preliminary numerical comparisons indicate that, for the standard robust principal component analysis (PCA) problem, IALM+RP is at least two to six times faster than IALM with an insignificant reduction in accuracy, and that for the outlier pursuit (OP) problem, IALM+RP is at least 6.9 times faster, and up to 8.3 times faster when the matrix size is 2000×2000.
Recovering an unknown high-dimensional low-rank matrix from a small set of entries is a common task in machine learning, system identification, image restoration, and other fields. In many practical applications, the few observations are corrupted by noise and the noise level is unknown. A novel model with a nuclear norm and a square-root type estimator has been proposed, which does not rely on knowledge or an estimate of the standard deviation of the noise. In this paper, we first reformulate the problem into an equivalent variable-separated form by introducing an auxiliary variable. We then propose an efficient alternating direction method of multipliers (ADMM) for solving it. Both resulting subproblems admit explicit solutions, which makes each iteration of our algorithm cheap to compute. Finally, numerical results show the benefits of the model and the efficiency of the proposed method.
Based on the combination of Racah's group-theoretical consideration with Slater's wavefunction, a 91×91 complete energy matrix is established for the Pr3+ ion in a tetragonal ligand field of D2d symmetry. The Stark energy levels of Pr3+ ions doped separately in LiYF4 and LiBiF4 crystals are then calculated, and our calculations imply that the complete energy matrix method can be used as an effective tool for calculating the energy levels of systems doped with rare earth ions. In addition, the influence of Pr3+ on energy-level splitting is investigated, and the similarities and differences between the two doped crystals are demonstrated in detail by comparing several pairs of curves and their crystal field strength quantities. We find that the energy splitting patterns are similar and that the crystal field interaction of LiYF4:Pr3+ is stronger than that of LiBiF4:Pr3+.
To decrease the probability of missing data points or added noise in the inverse truncated mixing matrix (ITMM) algorithm, a two-stage frequency-domain method is proposed for blind source separation of underdetermined instantaneous mixtures. Because many sources are soft-sparse (not very sparse), the separation process is decomposed into two steps: ITMM and matrix completion. First, the mixing matrix is estimated and the sources are recovered by the traditional ITMM algorithm in the frequency domain. Then, to retrieve the missing data and remove noise, the matrix completion technique is applied to each source preliminarily estimated by the traditional ITMM algorithm in the frequency domain. Simulations show that, compared with the traditional ITMM algorithm, the proposed two-stage algorithm achieves better separation performance. In addition, time consumption is considered: the proposed algorithm outperforms the traditional ITMM algorithm at a cost of no more than one-fourth extra time.
A new first-order optimality condition for the basis pursuit denoise (BPDN) problem is derived. This condition provides a new approach to choosing the penalty parameters adaptively for a fixed point iteration algorithm. The result is also extended to matrix completion, a new field that follows on the heels of compressed sensing. Numerical experiments on sparse vector recovery and low-rank matrix completion show the validity of the theoretical results.
Funding: Projects (11661069, 61763041) supported by the National Natural Science Foundation of China; Project (IRT_15R40) supported by Changjiang Scholars and Innovative Research Team in University, China; Project (2017TS045) supported by the Fundamental Research Funds for the Central Universities, China.
Funding: Supported in part by a Sub Project of the National Key Research and Development Plan in 2020 (No. 2020YFC1511704); Beijing Information Science and Technology University (No. 2020KYNH212, No. 2021CGZH302); the Beijing Science and Technology Project (Grant No. Z211100004421009); and in part by the National Natural Science Foundation of China (Grant No. 61971048).
Funding: The National Natural Science Foundation of China (No. 61771316).
Funding: Supported by the National Natural Science Foundation of China (No. 61271240); Jiangsu Province Natural Science Fund Project (No. BK2010077); Subject of Twelfth Five Years Plans in Jiangsu Second Normal University (No. 417103).
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 12226007, 12271271, 11925106, 12231011, 11931001 and 11971247; the Fundamental Research Funds for the Central Universities under Grant No. ZB22000105; and the China National Key R&D Program under Grant Nos. 2022YFA1003703, 2022YFA1003800, and 2019YFC1908502.
Abstract: In matrix completion, additional covariates often provide valuable information for completing the unobserved entries of a high-dimensional low-rank matrix A. In this paper, the authors consider the matrix recovery problem when there are multiple structural breaks in the coefficient matrix β under the column-space-decomposition model A = Xβ + B. A cumulative sum (CUSUM) statistic is constructed based on the penalized estimation of β. The CUSUM statistic is then incorporated into the Wild Binary Segmentation (WBS) algorithm to consistently estimate the locations of the breaks. Consequently, a nearly optimal recovery of A is achieved. The theoretical findings are further corroborated by numerical experiments and a real-data application.
Funding: Supported in part by the National Natural Science Foundation of China (Grant No. 12201473) and by the Science Foundation of Wuhan Institute of Technology (Grant No. K202256). The research of M. K. Ng was supported in part by the HKRGC GRF (Grant Nos. 12300218, 12300519, 17201020, 17300021). The research of X. Zhang was supported in part by the National Natural Science Foundation of China (Grant No. 12171189), by the Knowledge Innovation Project of Wuhan (Grant No. 2022010801020279), and by the Fundamental Research Funds for the Central Universities (Grant No. CCNU22JC023).
Abstract: In this paper, we study the low-rank matrix completion problem with Poisson observations, where only partial entries are available and the observations are corrupted by Poisson noise. We propose a novel model composed of the Kullback-Leibler (KL) divergence, obtained from the maximum likelihood estimation of Poisson noise, together with total variation (TV) and nuclear norm constraints. Here the nuclear norm and TV constraints are used to exploit the approximate low-rankness and piecewise smoothness of the underlying matrix, respectively. The advantage of combining these two constraints in the proposed model is that the low-rankness and piecewise smoothness of the underlying matrix can be exploited simultaneously, and both can serve as regularizers for many real-world image data. An upper error bound for the estimator of the proposed model is established with high probability, and it is no larger than the bound obtained with only the TV or only the nuclear norm constraint. To the best of our knowledge, this is the first work to use both low-rank and TV constraints with theoretical error bounds for matrix completion under Poisson observations. Extensive numerical examples on both synthetic data and real-world images are reported to corroborate the superiority of the proposed approach.
Abstract: This paper introduces an algorithm for the nonnegative matrix factorization-and-completion problem, which aims to find nonnegative low-rank matrices X and Y so that the product XY approximates a nonnegative data matrix M whose elements are partially known (to a certain accuracy). This problem aggregates two existing problems: (i) nonnegative matrix factorization, where all entries of M are given, and (ii) low-rank matrix completion, where nonnegativity is not required. By taking advantage of both nonnegativity and low-rankness, one can generally obtain better results than by using only one of the two properties. We propose to solve the non-convex constrained least-squares problem using an algorithm based on the classical alternating direction augmented Lagrangian method. Preliminary convergence properties of the algorithm and numerical simulation results are presented. Compared to a recent algorithm for nonnegative matrix factorization, the proposed algorithm produces factorizations of similar quality using only about half of the matrix entries. On tasks of recovering incomplete grayscale and hyperspectral images, the proposed algorithm yields overall better quality than two recent matrix-completion algorithms that do not exploit nonnegativity.
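The non-convex constrained least-squares problem described above can be written down as follows. This is a sketch of one natural formulation consistent with the abstract; the index set Ω of known entries and the sampling operator P_Ω (which keeps the entries in Ω and zeroes out the rest) are notation assumed here, not taken from the paper.

```latex
\min_{X,\,Y}\ \frac{1}{2}\,\bigl\| P_\Omega(XY - M) \bigr\|_F^2
\quad \text{subject to} \quad X \ge 0,\ \ Y \ge 0 ,
```

Taking Ω to be the full index set gives nonnegative matrix factorization, and dropping the sign constraints gives a factorized form of low-rank matrix completion, which matches the two special cases (i) and (ii).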
Funding: This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61872190, 61772285, 61572263 and 61906098), in part by the Natural Science Foundation of Jiangsu Province (BK20161516), and in part by the Open Fund of MIIT Key Laboratory of Pattern Analysis and Machine Intelligence of NUAA.
Abstract: Survival analysis aims to predict the occurrence time of a particular event of interest, which is crucial for the prognosis analysis of diseases. Because of the limited study period and possible loss to follow-up, the observed data inevitably involve some censored instances, which poses a unique challenge that distinguishes survival analysis from general regression problems. In addition, survival analysis suffers from other inherent challenges such as the high-dimension and small-sample-size problems. To address these challenges, we propose a novel multi-task regression learning model, the prior information guided transductive matrix completion (PigTMC) model, to predict the survival status of new instances. Specifically, we use the multi-label transductive matrix completion framework to leverage the censored instances together with the uncensored instances as training samples, and simultaneously employ a multi-task transductive feature selection scheme to alleviate the overfitting caused by high-dimension and small-sample-size data. In addition, we employ the prior temporal stability of the survival statuses at adjacent time intervals to guide survival analysis. Furthermore, we design an optimization algorithm with guaranteed convergence to solve the proposed PigTMC model. Finally, extensive experiments on real microarray gene expression datasets demonstrate that the proposed model outperforms widely used competing methods.
Funding: JL was supported by China Postdoctoral Science Foundation grant No. 2017M620589. JFC was supported in part by Hong Kong Research Grant Council (HKRGC) grants 16300616 and 16306317. HK Zhao was supported in part by NSF grants DMS-1418422 and DMS-1622490.
Abstract: We investigate the problem of robust matrix completion in which a fraction of the observations is corrupted by sparse outlier noise. We propose an algorithmic framework based on the ADMM algorithm for a non-convex optimization problem whose objective function consists of an l1-norm data fidelity term and a rank constraint. To reduce the computational cost per iteration, two inexact schemes are developed to replace the most time-consuming step in the generic ADMM algorithm. The resulting algorithms remarkably outperform the existing solvers for robust matrix completion with outlier noise. When the noise is severe and the underlying matrix is ill-conditioned, the proposed algorithms are faster and give more accurate solutions than state-of-the-art robust matrix completion approaches.
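For orientation, a problem with an l1-norm data fidelity term and a rank constraint can be sketched as below. The symbols are assumptions introduced for illustration: M holds the observed (possibly corrupted) entries, Ω is the set of observed positions, P_Ω keeps the entries in Ω and zeroes out the rest, and r is the target rank.

```latex
\min_{X}\ \bigl\| P_\Omega(X - M) \bigr\|_1
\quad \text{subject to} \quad \operatorname{rank}(X) \le r ,
```

The entrywise l1 norm penalizes large residuals only linearly, which is why it tolerates a few grossly corrupted entries better than a squared Frobenius misfit would.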
Abstract: In this paper, we propose a decentralized algorithm to solve the low-rank matrix completion problem and analyze its privacy-preserving property. Suppose that we want to recover a low-rank matrix D = [D1, D2, ..., DL] from a subset of its entries. In a network composed of L agents, each agent i observes some entries of Di. We factorize the unknown matrix D as the product of a public matrix X, which is common to all agents, and a private matrix Y = [Y1, Y2, ..., YL], of which Yi is held by agent i only. Each agent i updates Yi and its local estimate of X, denoted by X(i), in an alternating manner. By exchanging information with neighbors, all the agents move toward a consensus on the estimates X(i). Once the consensus is (nearly) reached throughout the network, each agent i recovers Di = X(i)Yi, and thus D is recovered. In this process, communication through the network may disclose sensitive information about the data matrices Di to a malicious agent. We prove that in the proposed algorithm, D-LMaFit, if the network topology is well designed, the malicious agent is unable to reconstruct the sensitive information of others.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11431002, 71271021 and 11301022) and the Fundamental Research Funds for the Central Universities of China (Grant No. 2012YJS118).
Abstract: The semidefinite matrix completion (SMC) problem is to recover a low-rank positive semidefinite matrix from a small subset of its entries. The problem is well known but NP-hard in general. We first show that in some cases the SMC problem and the S1/2 relaxation model share a unique solution. We then prove that the global optimal solutions of the S1/2 regularization model are fixed points of a symmetric matrix half thresholding operator. We give an iterative scheme for solving the S1/2 regularization model and present a convergence analysis of the iterative sequence. Through the optimal regularization parameter setting together with truncation techniques, we develop an HTE algorithm for the S1/2 regularization model, and numerical experiments confirm the efficiency and robustness of the proposed algorithm.
Funding: Supported by the National Social Science Foundation of China (Grant No. 11BGL053), the National Natural Science Foundation of China (Grant Nos. 11101434, 10971122 and 11101274), Scientific and Technological Projects of Shandong Province (Grant No. 2009GG10001012), the Excellent Young Scientist Foundation of Shandong Province (Grant No. 2010BSE06047), the Doctoral Program of Higher Education of China (Grant No. 20110073120069), the Shandong Province Natural Science Foundation (Grant No. ZR2012GQ004), and the Independent Innovation Foundation of Shandong University (Grant No. 12120083399170).
Abstract: Linear programming models have been widely used in input-output analysis for analyzing the interdependence of industries in economics and in environmental science. In these applications, some entries of the coefficient matrix cannot be measured physically or are subject to sampling errors. However, the coefficient matrix can often be assumed to be low-rank. We characterize the robust counterpart of these types of linear programming problems with an uncertainty set described by the nuclear norm. Simulations for the input-output analysis show that the new paradigm can be helpful.
Funding: This work was supported by the Chinese NSF Grants (Nos. 11331012 and 81173633), the China National Funds for Distinguished Young Scientists (No. 11125107), and the CAS Program for Cross & Cooperative Team of the Science & Technology Innovation. The authors are grateful to Professors Masao Fukushima and Ya-xiang Yuan for their warm encouragement and valuable suggestions. They also thank the two anonymous referees very much for their useful comments on an early version of this paper.
Abstract: Based on the idea of maximum determinant positive definite matrix completion, Yamashita (Math Prog 115(1):1–30, 2008) proposed a new sparse quasi-Newton update, called MCQN, for unconstrained optimization problems with sparse Hessian structures. In exchange for relaxing the secant equation, the MCQN update avoids solving difficult subproblems and overcomes the ill-conditioning of approximate Hessian matrices. However, local and superlinear convergence results were only established for the MCQN update with the DFP method. In this paper, we extend the convergence result to the MCQN update with the whole of Broyden's convex family. Numerical results are also reported, which suggest some efficient ways of choosing the parameter in the MCQN update with Broyden's family.
Abstract: In a matrix-completion problem the aim is to specify the missing entries of a matrix in order to produce a matrix with particular properties. In this paper we survey results concerning matrix-completion problems where we look for completions of various types for partial matrices supported on a given pattern. We see that the existence of completions of the required type often depends on the chordal properties of graphs associated with the pattern.
Funding: Supported by the National Natural Science Foundation of China (No. 51275348) and the College Students Innovation and Entrepreneurship Training Program of Tianjin University (No. 201210056339).
Abstract: In this paper, a unified matrix recovery model is proposed for diverse corrupted matrices. Owing to the separable structure of the proposed model, the convex optimization problem can be solved efficiently by an inexact augmented Lagrange multiplier (IALM) method. Additionally, a random projection acceleration technique (IALM+RP) is adopted to improve the success rate. Preliminary numerical comparisons indicate that for the standard robust principal component analysis (PCA) problem, IALM+RP is at least two to six times faster than IALM with an insignificant reduction in accuracy, and for the outlier pursuit (OP) problem, IALM+RP is at least 6.9 times faster, and up to 8.3 times faster when the size of the matrix is 2000 × 2000.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11971149, 12101195, 12071112, 11871383) and the Natural Science Foundation of Henan Province for Youth (Grant No. 202300410146).
Abstract: Recovering an unknown high-dimensional low-rank matrix from a small set of entries is a problem that arises widely in machine learning, system identification, image restoration, and other fields. In many practical applications, the few observations are corrupted by noise and the noise level is unknown. A novel model with a nuclear norm and a square-root type estimator has been proposed, which does not rely on knowledge or an estimate of the standard deviation of the noise. In this paper, we first reformulate the problem into an equivalent variable-separated form by introducing an auxiliary variable. We then propose an efficient alternating direction method of multipliers (ADMM) for solving it. Both resulting subproblems admit explicit solutions, which keeps the computational cost of the algorithm low. Finally, the numerical results show the benefits of the model and the efficiency of the proposed method.
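The abstract does not write out the model, so the following is only an assumed form of a square-root type nuclear-norm estimator and of a variable-separated reformulation with an auxiliary variable Z. The symbols M (noisy observations), Ω (observed positions), P_Ω (restriction to Ω) and λ (regularization parameter) are introduced here for illustration.

```latex
\min_{X}\ \bigl\| P_\Omega(X - M) \bigr\|_F + \lambda \,\| X \|_*
\qquad\Longleftrightarrow\qquad
\min_{X,\,Z}\ \bigl\| P_\Omega(Z - M) \bigr\|_F + \lambda \,\| X \|_*
\quad \text{subject to} \quad X = Z .
```

Because the data-fit term is the Frobenius norm itself rather than its square, a suitable λ does not scale with the noise standard deviation, which is the usual motivation for square-root type estimators. In the split form each ADMM subproblem involves only one of the two terms.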
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 10774103 and 10974138).
Abstract: Based on the combination of Racah's group-theoretical consideration with Slater's wavefunction, a 91 × 91 complete energy matrix is established in the tetragonal ligand field D2d for the Pr3+ ion. The Stark energy levels of Pr3+ ions doped separately in LiYF4 and LiBiF4 crystals are then calculated, and our calculations imply that the complete energy matrix method can be used as an effective tool for calculating the energy levels of systems doped with rare earth ions. In addition, the influence of Pr3+ on energy-level splitting is investigated, and the similarities and differences between the two doped crystals are demonstrated in detail by comparing several pairs of their curves and their crystal field strength quantities. We see that the energy splitting patterns are similar and that the crystal field interaction of LiYF4:Pr3+ is stronger than that of LiBiF4:Pr3+.
Funding: The National Natural Science Foundation of China (No. 60872074).
Abstract: To reduce the probability that data points are missed or noise is added by the inverse truncated mixing matrix (ITMM) algorithm, a two-stage frequency-domain method is proposed for blind source separation of underdetermined instantaneous mixtures. The separation process is decomposed into two steps, ITMM and matrix completion, on the premise that there are many soft-sparse (not very sparse) sources. First, the mixing matrix is estimated and the sources are recovered by the traditional ITMM algorithm in the frequency domain. Then, to retrieve the missing data and remove noise, the matrix completion technique is applied to each source preliminarily estimated by the traditional ITMM algorithm in the frequency domain. Simulations show that, compared with the traditional ITMM algorithm, the proposed two-stage algorithm has better separation performance. In addition, the time consumption is considered: the proposed algorithm outperforms the traditional ITMM algorithm at a cost of no more than one-fourth extra time.
Funding: Supported by the National Natural Science Foundation of China (No. 61271014), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20124301110003), and the Graduated Students Innovation Fund of Hunan Province (No. CX2012B238).
Abstract: A new first-order optimality condition for the basis pursuit denoise (BPDN) problem is derived. This condition provides a new approach to choosing the penalty parameters adaptively for a fixed point iteration algorithm. The result is also extended to matrix completion, a new field that follows on the heels of compressed sensing. Numerical experiments on sparse vector recovery and low-rank matrix completion show the validity of the theoretical results.
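As background for the fixed point iteration mentioned above, here is a minimal sketch of the standard soft-thresholding fixed point iteration for BPDN in its penalized form, min 0.5·||Ax − b||² + lam·||x||₁. The abstract does not state the new optimality condition or the adaptive rule, so the penalty `lam` is simply held fixed here; the function names and the step size choice are assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Entrywise soft thresholding: shrink each entry of v toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def bpdn_fixed_point(A, b, lam, n_iter=200):
    """Hypothetical helper: iterate x = S(x - step * A^T (A x - b), step * lam).

    A fixed point of this map satisfies the first-order optimality
    condition of 0.5 * ||A x - b||^2 + lam * ||x||_1.
    """
    # Step size 1/L, where L is the squared largest singular value of A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        residual_grad = A.T @ (A @ x - b)
        x = soft_threshold(x - step * residual_grad, step * lam)
    return x
```

When `A` is the identity the iteration reduces to a single soft thresholding of `b`, which makes the role of the penalty parameter easy to see: entries of `b` with magnitude below `lam` are set to zero and the rest are shrunk by `lam`.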