We consider a fundamental problem in machine learning, structural risk minimization, whose objective can be represented as the average of a large number of smooth component functions plus a simple, convex (but possibly non-smooth) function. In this paper, we propose a novel proximal variance-reduced stochastic method that builds on the previously introduced Point-SAGA. Our method performs two proximal operator calculations by incorporating fast Douglas–Rachford splitting, and follows the scheme of the FISTA algorithm in its choice of momentum factors. We show that the objective function value at the iterates converges at a rate of O(1/k) when each loss function is convex and smooth. In addition, we prove that our method achieves a linear convergence rate when the loss functions are strongly convex and smooth. Experiments demonstrate the effectiveness of the proposed algorithm, which shows good acceleration especially when the loss function is ill-conditioned.
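To make the ingredients named in the abstract concrete, the following is a minimal Python sketch of three standard building blocks: a composite objective (an average of smooth losses plus a simple non-smooth term), a proximal operator, and the classical FISTA momentum sequence. It is not the proposed algorithm. The least-squares loss, the L1 regularizer, and the names `objective`, `prox_l1`, and `fista_momentum` are assumptions chosen purely for illustration.

```python
import numpy as np


def objective(x, A, b, lam):
    """Composite objective: (1/n) * sum_i 0.5*(a_i^T x - b_i)^2 + lam * ||x||_1.

    The least-squares components and the L1 term are illustrative
    assumptions; the abstract only requires smooth components plus a
    simple convex (possibly non-smooth) function.
    """
    residual = A @ x - b
    return 0.5 * np.mean(residual ** 2) + lam * np.sum(np.abs(x))


def prox_l1(v, step, lam):
    """Proximal operator of step * lam * ||.||_1, i.e. soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - step * lam, 0.0)


def fista_momentum(num_iters):
    """Classical FISTA momentum factors beta_k = (t_k - 1) / t_{k+1},

    with t_1 = 1 and t_{k+1} = (1 + sqrt(1 + 4 * t_k^2)) / 2.
    """
    t = 1.0
    betas = []
    for _ in range(num_iters):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        betas.append((t - 1.0) / t_next)
        t = t_next
    return betas
```

For example, `prox_l1(np.array([3.0, -0.5, 1.0]), 1.0, 1.0)` shrinks each entry toward zero by 1 and clips at zero, and `fista_momentum(k)` starts at 0 and increases toward 1 as `k` grows.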