Journal Articles
7 articles found
1. Convergence of Stochastic Gradient Descent in Deep Neural Network [Cited by: 4]
Authors: Bai-cun ZHOU, Cong-ying HAN, Tian-de GUO. Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 2021, Issue 1, pp. 126-136 (11 pages).
Stochastic gradient descent (SGD) is one of the most common optimization algorithms used in pattern recognition and machine learning. This algorithm and its variants are the preferred choice for optimizing the parameters of deep neural networks because of their low storage requirements and fast computation. Previous studies on the convergence of these algorithms were based on some traditional assumptions in optimization. However, the deep neural network has its own unique properties, and some of these assumptions are inappropriate for the actual optimization process of this kind of model. In this paper, we modify the assumptions to make them more consistent with the actual optimization process of deep neural networks. Based on the new assumptions, we study the convergence and convergence rate of SGD and two of its common variants. In addition, we carry out numerical experiments with LeNet-5, a common network architecture, on the MNIST data set to verify the rationality of our assumptions.
Keywords: stochastic gradient descent, deep neural network, convergence
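For readers new to the topic, the following is a minimal sketch of plain SGD on a synthetic least-squares problem (the problem, data, and step size are illustrative assumptions; the paper itself analyzes SGD and two variants on deep networks such as LeNet-5):

```python
# A minimal SGD sketch on a toy least-squares loss; the constant step
# size and synthetic data are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

w = np.zeros(d)
lr = 0.01
for t in range(5000):
    i = rng.integers(n)                      # sample one example uniformly
    grad = (X[i] @ w - y[i]) * X[i]          # stochastic gradient of 0.5*(x_i^T w - y_i)^2
    w -= lr * grad

print("parameter error:", np.linalg.norm(w - w_true))
```

With a constant step size the iterates settle into a noise ball around the optimum; convergence-rate analyses of the kind developed in the paper quantify this behavior under the chosen assumptions.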
2. An Overview of Stochastic Quasi-Newton Methods for Large-Scale Machine Learning [Cited by: 2]
Authors: Tian-de GUO, Yan LIU, Cong-ying HAN. Journal of the Operations Research Society of China (EI, CSCD), 2023, Issue 2, pp. 245-275 (31 pages).
Numerous intriguing optimization problems arise as a result of the advancement of machine learning. The stochastic first-order method is the predominant choice for those problems due to its high efficiency. However, the negative effects of noisy gradient estimates and the high nonlinearity of the loss function result in a slow convergence rate. Second-order algorithms have their typical advantages in dealing with highly nonlinear and ill-conditioned problems. This paper provides a review of recent developments in stochastic variants of quasi-Newton methods, which construct Hessian approximations using only gradient information. We concentrate on BFGS-based methods in stochastic settings and highlight the algorithmic improvements that enable the algorithms to work in various scenarios. Future research on stochastic quasi-Newton methods should focus on enhancing their applicability, lowering the computational and storage costs, and improving the convergence rate.
Keywords: stochastic quasi-Newton methods, BFGS, large-scale machine learning
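As a concrete companion to the survey's subject, here is a hedged sketch of one common stochastic BFGS pattern: limited-memory curvature pairs built from gradient differences evaluated on the same mini-batch, combined with the standard two-loop recursion. The toy logistic-regression problem, batch size, memory size, and step size are all assumptions, not any single surveyed algorithm:

```python
# A schematic stochastic L-BFGS-type loop: curvature pairs (s, y) come
# from gradient differences on the same mini-batch, a device used by
# several oLBFGS-style methods; all problem data here are synthetic.
import numpy as np
from collections import deque

rng = np.random.default_rng(1)
n, d = 2000, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))

def grad(w, idx):
    """Mini-batch gradient of the logistic loss."""
    z = y[idx] * (X[idx] @ w)
    coef = -y[idx] / (1.0 + np.exp(z))
    return coef @ X[idx] / len(idx)

def two_loop(g, pairs):
    """Standard L-BFGS two-loop recursion: returns H_k @ g."""
    q, alphas = g.copy(), []
    for s, yv, rho in reversed(pairs):       # newest pair first
        a = rho * (s @ q); alphas.append(a); q -= a * yv
    if pairs:
        s, yv, _ = pairs[-1]
        q *= (s @ yv) / (yv @ yv)            # initial Hessian scaling H0
    for (s, yv, rho), a in zip(pairs, reversed(alphas)):
        b = rho * (yv @ q); q += (a - b) * s
    return q

w, pairs, lr = np.zeros(d), deque(maxlen=10), 0.1
for t in range(500):
    idx = rng.integers(n, size=64)
    g = grad(w, idx)
    w_new = w - lr * two_loop(g, list(pairs))
    s, yv = w_new - w, grad(w_new, idx) - g  # same batch for both gradients
    if s @ yv > 1e-10:                       # curvature (secant) condition
        pairs.append((s, yv, 1.0 / (s @ yv)))
    w = w_new
```

Reusing the same mini-batch for both gradients in the pair keeps the curvature estimate consistent despite sampling noise, one of the recurring algorithmic devices the survey highlights.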
3. Parallel Algorithms for Large-scale Linearly Constrained Minimization Problem
Authors: Cong-ying HAN, Fang-ying ZHENG, Tian-de GUO, Guo-ping HE. Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 2014, Issue 3, pp. 707-720 (14 pages).
In this paper, two PVD-type algorithms are proposed for solving inseparable linearly constrained optimization problems. Instead of computing the residual gradient function, the new algorithms use reduced gradients to construct the PVD directions in parallel, which greatly reduces the computation required at each iteration and is closer to practical applications when solving large-scale nonlinear programming problems. Moreover, based on an active set computed by coordinate rotation at each iteration, a feasible descent direction can easily be obtained by the extended reduced gradient method. This direction is then used as the PVD direction, and a new PVD algorithm is proposed for general linearly constrained optimization. Global convergence is also proved.
Keywords: nonlinear programming, large-scale minimization, parallel algorithm, constrained convex optimization
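The PVD (parallel variable distribution) idea itself can be illustrated on an unconstrained toy problem: each processor updates only its own block of variables, and a synchronization step combines the candidates. The sketch below is schematic (the quadratic objective, gradient block steps, and best-candidate synchronization are assumptions); the paper's algorithms additionally handle linear constraints via reduced gradients and an active set:

```python
# A schematic PVD-style loop: a "parallelization" phase of per-block
# steps followed by a synchronization phase; toy quadratic problem.
import numpy as np

rng = np.random.default_rng(2)
d, blocks = 12, 4
A = rng.normal(size=(d, d)); Q = A.T @ A + np.eye(d)   # SPD quadratic
b = rng.normal(size=d)
f = lambda x: 0.5 * x @ Q @ x - b @ x
grad = lambda x: Q @ x - b

lr = 1.0 / np.linalg.eigvalsh(Q)[-1]                   # safe step for this quadratic
x = np.zeros(d)
idx = np.array_split(np.arange(d), blocks)             # variable partition
for it in range(200):
    g = grad(x)
    candidates = []
    for blk in idx:                                    # "parallel" phase
        xb = x.copy()
        xb[blk] -= lr * g[blk]                         # step only in own block
        candidates.append(xb)
    x_sync = x + sum(c - x for c in candidates)        # combined step = full gradient step
    x = min(candidates + [x_sync], key=f)              # synchronization phase
print("final objective:", f(x))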
4. New Discoveries of Domination Between Traffic Matrices
Authors: Peng-fei LIU, Wen-guo YANG, Tian-de GUO. Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 2017, Issue 3, pp. 561-566 (6 pages).
In this paper, the definition of domination is generalized to the case where the elements of the traffic matrices may have negative values. It is proved that D3 dominates D3 + λ(D2 − D1) for any λ ≥ 0 if D1 dominates D2. Let u(D) be the set of all the traffic matrices that are dominated by the traffic matrix D. It is shown that u(D1) and u(D2) are isomorphic. Besides, similar results are obtained for multi-commodity flow problems. Furthermore, the results are generalized to integral flows.
Keywords: robust network design, multi-commodity flows, domination
5. The Coverage Holes of The Largest Component of Random Geometric Graph
Authors: Chang-long YAO, Tian-de GUO. Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 2015, Issue 4, pp. 855-862 (8 pages).
In this paper, a domain in a cube is called a coverage hole if it is not covered by the largest component of the random geometric graph in this cube. We obtain asymptotic properties of the size of the largest coverage hole in the cube. In addition, we give an exponentially decaying tail bound for the probability that a line of length s does not intersect the coverage of the infinite component of continuum percolation. These results have applications in communication networks, especially in wireless ad-hoc sensor networks.
Keywords: random geometric graph, continuum percolation, wireless sensor networks, coverage
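A small simulation conveys the objects involved: build a random geometric graph in the unit square (the paper works in a cube), extract its largest component, and estimate the area it fails to cover. All parameter values below are illustrative assumptions:

```python
# Sketch: random geometric graph in the unit square, largest component
# via union-find, and a grid estimate of the uncovered ("hole") area.
import numpy as np

rng = np.random.default_rng(3)
n, r = 500, 0.08                      # node count and connection/coverage radius
pts = rng.random((n, 2))

# Union-find over all pairs within distance r.
parent = list(range(n))
def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]; i = parent[i]
    return i
for i in range(n):
    for j in range(i + 1, n):
        if np.sum((pts[i] - pts[j]) ** 2) <= r * r:
            parent[find(i)] = find(j)

roots = np.array([find(i) for i in range(n)])
comp = pts[roots == np.bincount(roots).argmax()]  # vertices of the largest component

# A grid point lies in a coverage hole if it is farther than r from
# every vertex of the largest component.
g = np.linspace(0, 1, 60)
gx, gy = np.meshgrid(g, g)
grid = np.stack([gx.ravel(), gy.ravel()], axis=1)
d2 = ((grid[:, None, :] - comp[None, :, :]) ** 2).sum(-1)
hole_frac = np.mean(d2.min(axis=1) > r * r)
print(f"largest component: {len(comp)} nodes; uncovered area fraction ≈ {hole_frac:.3f}")
```

The paper's results describe how the largest such hole scales asymptotically; the simulation only visualizes the finite-n picture.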
6. Preface
Author: Tian-de GUO. Journal of the Operations Research Society of China (EI, CSCD), 2023, Issue 2, pp. 243-244 (2 pages).
Optimization is the theoretical and algorithmic basis of artificial intelligence. Professor Michael Jordan, a member of the National Academy of Sciences, the National Academy of Engineering, and the American Academy of Arts and Sciences, pointed out in his one-hour report at the International Congress of Mathematicians that the transformation brought by the artificial intelligence era comes from optimization algorithms in application scenarios. Machine learning, in turn, offers many new ideas for optimization models and algorithms. The history of machine learning is a process of fusion of different learning models and optimization algorithms.
Keywords: artificial intelligence, learning, optimization
7. SEQUENTIAL SYSTEMS OF LINEAR EQUATIONS ALGORITHM FOR NONLINEAR OPTIMIZATION PROBLEMS-INEQUALITY CONSTRAINED PROBLEMS [Cited by: 5]
Authors: Zi-you GAO, Tian-de GUO, Guo-ping HE, Fang WU. Journal of Computational Mathematics (SCIE, CSCD), 2002, Issue 3, pp. 301-312 (12 pages).
Presents information on a study which proposed a superlinearly convergent algorithm of sequential systems of linear equations for nonlinear optimization problems with inequality constraints. Covers the assumptions; a discussion of lemmas about several matrices related to the common coefficient matrix F; a strengthening of the regularity assumptions on the functions involved; and numerical experiments.
Keywords: optimization, inequality constraints, algorithms, sequential systems of linear equations, coefficient matrices, superlinear convergence
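The structural idea, solving linear systems with a common KKT-like coefficient matrix at each iteration, can be illustrated on a toy problem with one inequality constraint. The sketch below takes a primal-dual Newton step on perturbed KKT conditions; it is only a schematic stand-in for the paper's SSLE algorithm, which solves several systems sharing one coefficient matrix F and establishes superlinear convergence:

```python
# Toy illustration of the "linear system per iteration" structure:
# min (x1-2)^2 + (x2-1)^2  s.t.  g(x) = x1^2 + x2^2 - 1 <= 0.
# The perturbation mu, starting point, and update rules are assumptions.
import numpy as np

f_grad = lambda x: 2 * (x - np.array([2.0, 1.0]))
g_val  = lambda x: x @ x - 1.0
g_grad = lambda x: 2 * x

x, lam, mu = np.array([0.5, 0.5]), 1.0, 1e-1
for it in range(30):
    H = 2 * np.eye(2) + lam * 2 * np.eye(2)      # Hessian of the Lagrangian
    a = g_grad(x)
    # KKT-like coefficient matrix with the characteristic block structure
    F = np.block([[H,                     a.reshape(2, 1)],
                  [lam * a.reshape(1, 2), np.array([[g_val(x)]])]])
    rhs = -np.concatenate([f_grad(x) + lam * a, [lam * g_val(x) + mu]])
    d = np.linalg.solve(F, rhs)                  # one linear system per iteration
    x, lam = x + d[:2], max(lam + d[2], 1e-8)    # keep the multiplier positive
    mu *= 0.5                                    # drive the perturbation to zero
print("x* ≈", x, " g(x*) =", g_val(x))
```

For this toy instance the iterates approach the projection-like solution on the unit circle with an active constraint; the paper's analysis concerns the much stronger superlinear rate under weakened regularity assumptions.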