Abstract: Proximal gradient descent and its accelerated version are effective methods for solving problems given as the sum of a smooth and a non-smooth term. When the smooth function can be represented as a sum of multiple functions, the stochastic proximal gradient method performs well. However, research on its accelerated version remains limited. This paper proposes a proximal stochastic accelerated gradient (PSAG) method to address problems involving a combination of smooth and non-smooth components, where the smooth part is the average of multiple block sums. Meanwhile, most existing convergence analyses hold only in expectation. To this end, under some mild conditions, we present an almost sure convergence result for unbiased gradient estimation in the non-smooth setting. Moreover, we establish that the minimum of the squared gradient mapping norm converges to zero with probability one.
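For illustration only, the following minimal Python sketch shows a generic proximal stochastic gradient step combined with Nesterov-style momentum on a toy l1-regularized least-squares problem; it is not the authors' PSAG method, and the problem data, step size, and momentum parameter are all hypothetical choices.

import numpy as np

def soft_threshold(v, tau):
    # proximal operator of tau * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(0)
n_blocks, d = 50, 20
A = rng.standard_normal((n_blocks, d))   # one row per block of the finite sum
b = rng.standard_normal(n_blocks)
lam, step, beta = 0.1, 1e-2, 0.9         # regularization, step size, momentum (hypothetical)

x = np.zeros(d)                          # current iterate
x_prev = x.copy()
y = x.copy()                             # extrapolated (momentum) point
for k in range(1000):
    i = rng.integers(n_blocks)           # sample one block uniformly: unbiased gradient estimate
    grad_i = A[i] * (A[i] @ y - b[i])    # stochastic gradient of the smooth part at y
    x_prev, x = x, soft_threshold(y - step * grad_i, step * lam)
    y = x + beta * (x - x_prev)          # momentum extrapolation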
Abstract: Consider the problem of minimizing the sum of two convex functions, one being smooth and the other non-smooth. In this paper, we introduce a general class of approximate proximal splitting (APS) methods for solving such minimization problems. Methods in the APS class include many well-known algorithms, such as the proximal splitting method, the block coordinate descent (BCD) method, and approximate gradient projection methods for smooth convex optimization. We establish the linear convergence of APS methods under a local error bound assumption. Since the latter is known to hold for compressive sensing and sparse group LASSO problems, our analysis implies the linear convergence of the BCD method for these problems without a strong convexity assumption.
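As a concrete instance of the BCD special case mentioned above, the following minimal Python sketch runs cyclic coordinate descent on a toy LASSO problem, where each one-dimensional subproblem has a closed-form soft-thresholding solution; the problem data and iteration counts are hypothetical.

import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 30))
b = rng.standard_normal(100)
lam = 0.1
col_sq = np.sum(A * A, axis=0)           # ||A_j||^2 for each column

x = np.zeros(30)
r = b - A @ x                            # running residual
for sweep in range(50):
    for j in range(30):
        r += A[:, j] * x[j]              # remove coordinate j from the residual
        x[j] = soft_threshold(A[:, j] @ r, lam) / col_sq[j]
        r -= A[:, j] * x[j]              # put it back with the updated value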
Funding: The research of S.-Q. Ma was supported in part by the Hong Kong Research Grants Council General Research Fund Early Career Scheme (No. CUHK 439513). The research of S.-Z. Zhang was supported in part by the National Science Foundation (No. CMMI 1161242).
Abstract: The alternating direction method of multipliers (ADMM) is widely used for solving structured convex optimization problems. Despite its success in practice, the convergence of the standard ADMM for minimizing the sum of N (N ≥ 3) convex functions, whose variables are linked by linear constraints, remained unclear for a very long time. Recently, Chen et al. (Math Program, doi:10.1007/s10107-014-0826-5, 2014) provided a counter-example showing that the ADMM for N ≥ 3 may fail to converge without further conditions. Since the ADMM for N ≥ 3 has been very successful when applied to many problems arising in practice, it is worth investigating under what sufficient conditions it can be guaranteed to converge. In this paper, we present such sufficient conditions guaranteeing a sublinear convergence rate for the ADMM with N ≥ 3. Specifically, we show that if one of the functions is convex (not necessarily strongly convex), the other N-1 functions are strongly convex, and the penalty parameter lies in a certain region, then the ADMM converges with rate O(1/t) in a certain ergodic sense and o(1/t) in a certain non-ergodic sense, where t denotes the number of iterations. As a by-product, we also provide a simple proof of the O(1/t) convergence rate of the two-block ADMM in terms of both objective error and constraint violation, without assuming any condition on the penalty parameter or strong convexity of the functions.
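To make the iteration concrete, here is a minimal Python sketch (not taken from the paper) of 3-block ADMM on a toy problem min over x1, x2, x3 of sum_i 0.5*||x_i - c_i||^2 subject to A1 x1 + A2 x2 + A3 x3 = b, where every block objective is strongly convex, loosely mirroring the sufficient condition above; all matrices, vectors, and the penalty parameter are hypothetical.

import numpy as np

rng = np.random.default_rng(2)
m, n = 15, 10
A = [rng.standard_normal((m, n)) for _ in range(3)]
c = [rng.standard_normal(n) for _ in range(3)]
b = rng.standard_normal(m)
rho = 1.0                                      # penalty parameter (hypothetical)

x = [np.zeros(n) for _ in range(3)]
u = np.zeros(m)                                # scaled dual variable
for t in range(300):
    for i in range(3):                         # Gauss-Seidel sweep over the three blocks
        rest = sum(A[j] @ x[j] for j in range(3) if j != i)
        rhs = c[i] + rho * A[i].T @ (b - rest - u)
        x[i] = np.linalg.solve(np.eye(n) + rho * A[i].T @ A[i], rhs)
    u += sum(A[j] @ x[j] for j in range(3)) - b    # multiplier update

Each block update here reduces to a small linear system only because the block objectives are quadratic; in general, each update is the minimizer of the augmented Lagrangian with the other blocks held fixed.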
Funding: The National Natural Science Foundation of China (No. 61179033).
Abstract: In this paper, we propose a modified proximal gradient method for solving a class of nonsmooth convex optimization problems, which arise in many contemporary statistical and signal processing applications. The proposed method adopts a new scheme to construct the descent direction based on the proximal gradient method. It is proven that the modified proximal gradient method is Q-linearly convergent without assuming strong convexity of the objective function. Finally, numerical experiments are conducted to evaluate the proposed method.
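The paper's specific construction is not reproduced here; as background, the following minimal Python sketch shows the standard descent-direction view of the proximal gradient method, d = prox_{alpha*g}(x - alpha*grad f(x)) - x followed by a backtracking step along d, on a toy LASSO instance with hypothetical data.

import numpy as np

def prox_l1(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(3)
A = rng.standard_normal((80, 25))
b = rng.standard_normal(80)
lam = 0.1
alpha = 1.0 / np.linalg.norm(A, 2) ** 2        # step size 1/L with L = ||A||_2^2

def F(x):                                      # composite objective f + g
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

x = np.zeros(25)
for k in range(200):
    grad = A.T @ (A @ x - b)                   # gradient of the smooth part
    d = prox_l1(x - alpha * grad, alpha * lam) - x   # proximal descent direction
    t = 1.0
    while F(x + t * d) > F(x) - 0.5 * t * np.dot(d, d) / alpha and t > 1e-8:
        t *= 0.5                               # backtrack until sufficient decrease
    x = x + t * d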
Abstract: The alternating direction method of multipliers (ADMM) is widely used in various fields, and many variants have been customized in the literature for different application scenarios [1] [2] [3] [4]. Among them, the linearized alternating direction method of multipliers (LADMM) has received extensive attention because of its effectiveness and ease of implementation. This paper mainly discusses the application of ADMM to dictionary learning (a non-convex problem). Many numerical experiments show that ADMM converges slowly when high accuracy is required, especially near the optimal solution. Therefore, we introduce LADMM to accelerate the convergence of ADMM. Specifically, the subproblem is solved by linearizing its quadratic term, and the convergence of the algorithm is proved. Finally, a brief summary concludes the paper.
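To show what linearizing the quadratic term of the subproblem looks like in the simplest setting, here is a minimal Python sketch of LADMM (not the dictionary-learning formulation of this paper) for min lam*||x||_1 + 0.5*||z - b||^2 subject to Ax = z: the x-subproblem's coupling term (rho/2)*||Ax - z + u||^2 is linearized at the current iterate, so the update reduces to a single prox step; all data and parameters are hypothetical.

import numpy as np

def prox_l1(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(4)
A = rng.standard_normal((60, 20))
b = rng.standard_normal(60)
lam, rho = 0.1, 1.0
mu = 0.9 / (rho * np.linalg.norm(A, 2) ** 2)   # linearization step: mu * rho * ||A||^2 < 1

x = np.zeros(20)
z = np.zeros(60)
u = np.zeros(60)                               # scaled dual variable
for k in range(500):
    # linearized x-update: one prox step instead of an inner solve
    x = prox_l1(x - mu * rho * A.T @ (A @ x - z + u), mu * lam)
    z = (b + rho * (A @ x + u)) / (1.0 + rho)  # exact z-update (quadratic subproblem)
    u += A @ x - z                             # multiplier update

The step-size condition mu * rho * ||A||^2 < 1 is the usual requirement for linearized updates of this kind.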
Funding: Supported by the National Natural Science Foundation of China (No. 61662036).
Abstract: The alternating direction method of multipliers (ADMM) has received much attention in recent years due to various demands from machine learning and big-data-related optimization. In 2013, Ouyang et al. extended ADMM to the stochastic setting for solving some stochastic optimization problems, inspired by the structural risk minimization principle. In this paper, we consider a stochastic variant of symmetric ADMM, named symmetric stochastic linearized ADMM (SSL-ADMM). In particular, using the framework of variational inequalities, we analyze the convergence properties of SSL-ADMM. Moreover, we show that, with high probability, SSL-ADMM has an O((ln N)·N^(-1/2)) constraint violation bound and objective error bound for convex problems, and an O((ln N)^2·N^(-1)) constraint violation bound and objective error bound for strongly convex problems, where N is the iteration number. Symmetric ADMM can improve algorithmic performance compared with classical ADMM, and numerical experiments on statistical machine learning show that this improvement is also present in the stochastic setting.
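As a deterministic reference point, the following minimal Python sketch shows the symmetric ADMM skeleton that SSL-ADMM linearizes and makes stochastic: the multiplier is updated twice per iteration, once after each primal block. The toy consensus problem min lam*||x||_1 + 0.5*||z - b||^2 subject to x - z = 0, the penalty parameter, and the two dual step-size factors are all hypothetical choices.

import numpy as np

def prox_l1(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(5)
b = rng.standard_normal(40)
lam, rho = 0.1, 1.0
r = s = 0.9                                  # dual step-size factors (hypothetical)

x = np.zeros(40)
z = np.zeros(40)
u = np.zeros(40)                             # scaled multiplier
for k in range(300):
    x = prox_l1(z - u, lam / rho)            # x-block update
    u += r * (x - z)                         # first (intermediate) multiplier update
    z = (b + rho * (x + u)) / (1.0 + rho)    # z-block update
    u += s * (x - z)                         # second multiplier update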