1,480 articles found
L1-Smooth SVM with Distributed Adaptive Proximal Stochastic Gradient Descent with Momentum for Fast Brain Tumor Detection
1
Authors: Chuandong Qin, Yu Cao, Liqun Meng. Computers, Materials & Continua (SCIE, EI), 2024, No. 5, pp. 1975-1994 (20 pages)
Brain tumors come in various types, each with distinct characteristics and treatment approaches, making manual detection a time-consuming and potentially ambiguous process. Brain tumor detection is a valuable tool for gaining a deeper understanding of tumors and improving treatment outcomes. Machine learning models have become key players in automating brain tumor detection. Gradient descent methods are the mainstream algorithms for solving machine learning models. In this paper, we propose a novel distributed proximal stochastic gradient descent approach to solve the L1-Smooth Support Vector Machine (SVM) classifier for brain tumor detection. Firstly, the smooth hinge loss is introduced as the loss function of the SVM. It avoids the nondifferentiability at the zero point encountered by the traditional hinge loss function during gradient descent optimization. Secondly, the L1 regularization method is employed to sparsify features and enhance the robustness of the model. Finally, adaptive proximal stochastic gradient descent (PGD) with momentum and distributed adaptive PGD with momentum (DPGD) are proposed and applied to the L1-Smooth SVM. Distributed computing is crucial in large-scale data analysis, with its value manifested in extending algorithms to distributed clusters, thus enabling more efficient processing of massive amounts of data. The DPGD algorithm leverages Spark, enabling full utilization of the computer's multi-core resources. Due to the sparsity induced by L1 regularization on parameters, it exhibits significantly accelerated convergence speed. From the perspective of loss reduction, DPGD converges faster than PGD. The experimental results show that adaptive PGD with momentum and its variants have achieved cutting-edge accuracy and efficiency in brain tumor detection. From pre-trained models, both the PGD and DPGD outperform other models, boasting an accuracy of 95.21%.
Keywords: support vector machine; proximal stochastic gradient descent; brain tumor detection; distributed computing
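The proximal step described in this abstract can be illustrated with a minimal sketch. It assumes a quadratically smoothed hinge loss (one common form, not necessarily the one used in the paper) and plain momentum with a fixed step size; the adaptive step-size rule and the Spark-based distribution are omitted. The names `soft_threshold`, `smooth_hinge_grad`, and `prox_sgd_momentum` are hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def smooth_hinge_grad(z):
    """Derivative of an assumed smoothed hinge loss at margin z.

    Assumed loss: 0 for z >= 1, (1 - z)^2 / 2 for 0 < z < 1, 0.5 - z for z <= 0.
    """
    if z >= 1.0:
        return 0.0
    if z <= 0.0:
        return -1.0
    return z - 1.0


def prox_sgd_momentum(X, y, lam=0.01, lr=0.1, beta=0.9, epochs=50):
    """Proximal SGD with momentum for an L1-regularized smooth-hinge SVM (no bias term)."""
    d = len(X[0])
    w = [0.0] * d
    vel = [0.0] * d
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            g = smooth_hinge_grad(margin)
            for j in range(d):
                # Momentum on the gradient of the smooth part only.
                vel[j] = beta * vel[j] + g * yi * xi[j]
                # Gradient step, then the L1 proximal (soft-thresholding) step.
                w[j] = soft_threshold(w[j] - lr * vel[j], lr * lam)
    return w


# Toy data (synthetic): only the first feature carries signal.
X_toy = [[2.0, 0.0], [-2.0, 0.0]]
y_toy = [1, -1]
w_toy = prox_sgd_momentum(X_toy, y_toy)
```

On this toy data the weight of the uninformative second feature stays exactly zero, which is the sparsifying effect the L1 proximal step is meant to provide.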
Fractional Gradient Descent RBFNN for Active Fault-Tolerant Control of Plant Protection UAVs
2
Authors: Lianghao Hua, Jianfeng Zhang, (+1 more author), Dejie Li, Xiaobo Xi. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 3, pp. 2129-2157 (29 pages)
With the increasing prevalence of high-order systems in engineering applications, these systems often exhibit significant disturbances and can be challenging to model accurately. As a result, the active disturbance rejection controller (ADRC) has been widely applied in various fields. However, in controlling plant protection unmanned aerial vehicles (UAVs), which are typically large and subject to significant disturbances, load disturbances and the possibility of multiple actuator faults during pesticide spraying pose significant challenges. To address these issues, this paper proposes a novel fault-tolerant control method that combines a radial basis function neural network (RBFNN) with a second-order ADRC and leverages a fractional gradient descent (FGD) algorithm. We integrate the plant protection UAV model's uncertain parameters, load disturbance parameters, and actuator fault parameters and utilize the RBFNN for system parameter identification. The resulting ADRC exhibits load disturbance suppression and fault tolerance capabilities, and our proposed active fault-tolerant control law has Lyapunov stability implications. Experimental results obtained using a multi-rotor fault-tolerant test platform demonstrate that the proposed method outperforms other control strategies regarding load disturbance suppression and fault-tolerant performance.
Keywords: radial basis function neural network; plant protection unmanned aerial vehicle; active disturbance rejection controller; fractional gradient descent algorithm
Convergence of Hyperbolic Neural Networks Under Riemannian Stochastic Gradient Descent
3
Authors: Wes Whiting, Bao Wang, Jack Xin. Communications on Applied Mathematics and Computation (EI), 2024, No. 2, pp. 1175-1188 (14 pages)
We prove, under mild conditions, the convergence of a Riemannian gradient descent method for a hyperbolic neural network regression model, both in batch gradient descent and stochastic gradient descent. We also discuss a Riemannian version of the Adam algorithm. We show numerical simulations of these algorithms on various benchmarks.
Keywords: hyperbolic neural network; Riemannian gradient descent; Riemannian Adam (RAdam); training convergence
Rockburst Intensity Grade Prediction Model Based on Batch Gradient Descent and Multi-Scale Residual Deep Neural Network
4
Authors: Yu Zhang, Mingkui Zhang, (+1 more author), Jitao Li, Guangshu Chen. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 11, pp. 1987-2006 (20 pages)
Rockburst is a phenomenon in which free surfaces are formed during excavation, which subsequently causes the sudden release of energy in the construction of mines and tunnels. Light rockburst only peels off rock slices without ejection, while severe rockburst causes casualties and property loss. The frequency and degree of rockburst damage increase with the excavation depth. Moreover, rockburst is the leading engineering geological hazard in the excavation process, and thus the prediction of its intensity grade is of great significance to the development of geotechnical engineering. Therefore, the prediction of rockburst intensity grade is a problem that needs to be solved urgently. By comprehensively considering the occurrence mechanism of rockburst, this paper selects the stress index (σθ/σc), brittleness index (σc/σt), and rock elastic energy index (Wet) as the rockburst evaluation indexes through the Spearman coefficient method. This overcomes the low accuracy of prediction methods based on a single evaluation index. Following this, the BGD-MSR-DNN rockburst intensity grade prediction model based on batch gradient descent and a multi-scale residual deep neural network is proposed. The batch gradient descent (BGD) module is used to replace the gradient descent algorithm, which effectively improves the efficiency of the network and reduces the model training time. Moreover, the multi-scale residual (MSR) module solves the problem of network degradation when there are too many hidden layers in the deep neural network (DNN), thus improving the model prediction accuracy. The experimental results reveal that the BGD-MSR-DNN model reaches an accuracy of 97.1%, outperforming other comparable models. Finally, on actual projects such as the Qinling Tunnel and Daxiangling Tunnel, the model reached an accuracy of 100%. The model can be applied in mines and tunnel engineering to realize the accurate and rapid prediction of rockburst intensity grade.
Keywords: rockburst prediction; rockburst intensity grade; deep neural network; batch gradient descent; multi-scale residual
Efficient and High-quality Recommendations via Momentum-incorporated Parallel Stochastic Gradient Descent-Based Learning (cited by: 5)
5
Authors: Xin Luo, Wen Qin, (+2 more authors), Ani Dong, Khaled Sedraoui, MengChu Zhou. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2021, No. 2, pp. 402-411 (10 pages)
A recommender system (RS) relying on latent factor analysis usually adopts stochastic gradient descent (SGD) as its learning algorithm. However, owing to its serial mechanism, an SGD algorithm suffers from low efficiency and scalability when handling large-scale industrial problems. Aiming at addressing this issue, this study proposes a momentum-incorporated parallel stochastic gradient descent (MPSGD) algorithm, whose main idea is two-fold: a) implementing parallelization via a novel data-splitting strategy, and b) accelerating convergence rate by integrating momentum effects into its training process. With it, an MPSGD-based latent factor (MLF) model is achieved, which is capable of performing efficient and high-quality recommendations. Experimental results on four high-dimensional and sparse matrices generated by industrial RS indicate that owing to an MPSGD algorithm, an MLF model outperforms the existing state-of-the-art ones in both computational efficiency and scalability.
Keywords: big data; industrial application; industrial data; latent factor analysis; machine learning; parallel algorithm; recommender system (RS); stochastic gradient descent (SGD)
PROJECTED GRADIENT DESCENT BASED ON SOFT THRESHOLDING IN MATRIX COMPLETION (cited by: 1)
6
Authors: Zhao Yujuan, Zheng Baoyu, Chen Shouning. Journal of Electronics (China), 2013, No. 6, pp. 517-524 (8 pages)
Matrix completion is the extension of compressed sensing. In compressed sensing, we solve the underdetermined equations using a sparsity prior on the unknown signals. However, in matrix completion, we solve the underdetermined equations based on a sparsity prior on the set of singular values of the unknown matrix, which is also called the low-rank prior of the unknown matrix. This paper first introduces the basic concept of matrix completion, analyses the matrices suitable for matrix completion, and shows that such a matrix should satisfy two conditions: low rank and the incoherence property. Then the paper presents three reconstruction algorithms commonly used in matrix completion (the singular value thresholding algorithm, singular value projection, and atomic decomposition for minimum rank approximation) and points out their shared shortcoming: they need to know the rank of the original matrix. The Projected Gradient Descent based on Soft Thresholding (STPGD) proposed in this paper predicts the rank of the unknown matrix using soft thresholding and iterates based on projected gradient descent; thus it can estimate the rank of the unknown matrix exactly with low computational complexity, which is verified by numerical experiments. We also analyze the convergence and computational complexity of the STPGD algorithm, point out that this algorithm is guaranteed to converge, and analyse the number of iterations needed to reach a given reconstruction error. Comparing the computational complexity of the STPGD algorithm with that of other algorithms, we conclude that the STPGD algorithm not only reduces the computational complexity but also improves the precision of the reconstruction solution.
Keywords: Matrix Completion (MC); Compressed Sensing (CS); iterative thresholding algorithm; Projected gradient descent based on Soft Thresholding (STPGD)
An Efficient Energy Routing Protocol Based on Gradient Descent Method in WSNs (cited by: 1)
7
Authors: Ru Jin, Xinlian Zhou, Yue Wang. Journal of Information Hiding and Privacy Protection, 2020, No. 3, pp. 115-123 (9 pages)
In a wireless sensor network [1], the operation of a node depends on the battery power it carries. For environmental reasons, the node cannot replace the battery. In order to improve the life cycle of the network, energy becomes one of the key problems in the design of the wireless sensor network (WSN) routing protocol [2]. This paper proposes a routing protocol, ERGD, based on the method of gradient descent, that can minimize the consumption of energy. Within the communication radius of the current node, the distance between the current node and the next-hop node is assumed to generate a projected energy along the distance from the current node to the base station (BS); this projected energy and the remaining energy of the next-hop node are the key factors in finding the next-hop node. The simulation results show that the proposed protocol effectively extends the life cycle of the network and improves the reliability and fault tolerance of the system.
Keywords: wireless sensor network; gradient descent; residual energy; communication radius; network life cycle
Gradient Descent Algorithm for Small UAV Parameter Estimation System
8
Authors: Guo Jiandong, Liu Qingwen, Wang Kang. Transactions of Nanjing University of Aeronautics and Astronautics (EI, CSCD), 2017, No. 6, pp. 680-687 (8 pages)
A gradient descent algorithm with an adjustable parameter for attitude estimation is developed, aiming at attitude measurement for small unmanned aerial vehicles (UAVs) in real-time flight conditions. The accelerometer and magnetometer are introduced to construct an error equation with the gyros, so that the drifting characteristics of the gyroscope can be compensated by solving the error equation with the gradient descent algorithm. Performance of the presented algorithm is evaluated using a self-proposed micro-electro-mechanical system (MEMS) based attitude heading reference system which is mounted on a tri-axis turntable. The on-ground, turntable and flight experiments indicate that the estimated attitude has good accuracy. Also, the presented system is compared with an open-source flight control system which runs an extended Kalman filter (EKF), and the results show that the attitude control system using the gradient descent method can estimate the attitudes of the UAV effectively.
Keywords: gradient descent algorithm; attitude estimation; quaternions; small unmanned aerial vehicle (UAV)
Designing fuzzy inference system based on improved gradient descent method
9
Authors: Zhang Liquan, Shao Cheng. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2006, No. 4, pp. 853-857, 863 (6 pages)
The distribution of sampling data influences the completeness of the rule base, so that extrapolating missing rules is very difficult. Based on data mining, a self-learning method is developed for identifying the fuzzy model and extrapolating missing rules, by means of a confidence measure and the improved gradient descent method. The proposed approach can not only identify the fuzzy model, update its parameters and determine optimal output fuzzy sets simultaneously, but also resolve the uncontrollability problem caused by regions that the data do not cover. The simulation results, verified on the classical truck backer-upper control problem, show the effectiveness and accuracy of the proposed approach.
Keywords: data mining; fuzzy system; gradient descent method; missing rule
New Diamond Block Based Gradient Descent Search Algorithm for Motion Estimation in the MPEG-4 Encoder
10
Authors: 王振洲, 李桂苓. Transactions of Tianjin University (EI, CAS), 2003, No. 3, pp. 202-205 (4 pages)
Motion estimation is an important part of the MPEG-4 encoder, due to its significant impact on the bit rate and the output quality of the encoded sequence. Unfortunately, this feature takes a significant part of the encoding time, especially when the straightforward full search (FS) algorithm is used. In this paper, a new algorithm named the diamond block based gradient descent search (DBBGDS) algorithm, which is significantly faster than FS and gives similar quality of the output sequence, is proposed. At the same time, some other algorithms, such as three step search (TSS), improved three step search (ITSS), new three step search (NTSS), four step search (4SS), cellular search (CS), diamond search (DS) and block based gradient descent search (BBGDS), are adopted and compared with DBBGDS. As the experimental results show, DBBGDS has its own advantages. Although DS has been adopted by the MPEG-4 VM, its output sequence quality is worse than that of the proposed algorithm while its complexity is similar to the proposed one. Compared with BBGDS, the proposed algorithm can achieve a better output quality.
Keywords: MPEG; motion estimation; full search (FS); block based gradient descent search (BBGDS); diamond search (DS); new three step search (NTSS)
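As a rough illustration of the baseline block based gradient descent search (BBGDS) that the abstract compares against, the sketch below repeatedly evaluates the eight neighbours of the current displacement and moves to the cheapest one until the centre is the local minimum. The `cost` callable is a hypothetical stand-in for a block matching error such as the sum of absolute differences; the diamond variant proposed in the paper is not reproduced here.

```python
def bbgds(cost, start=(0, 0), max_steps=32):
    """Greedy 3x3 neighbourhood descent over integer displacements.

    cost(dx, dy) is assumed to return the matching error for the
    candidate motion vector (dx, dy); it is a placeholder, not a real codec API.
    """
    cx, cy = start
    best = cost(cx, cy)
    for _ in range(max_steps):
        # Evaluate the eight neighbours of the current centre.
        candidates = [
            (cost(cx + dx, cy + dy), cx + dx, cy + dy)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        ]
        c, nx, ny = min(candidates)
        if c >= best:
            break  # the centre is already the local minimum
        best, cx, cy = c, nx, ny
    return (cx, cy), best


# Hypothetical error surface with its minimum at displacement (3, -2).
vector, error = bbgds(lambda dx, dy: (dx - 3) ** 2 + (dy + 2) ** 2)
```

Because each step only inspects a small neighbourhood, the search visits far fewer candidates than a full search, at the risk of stopping in a local minimum of the error surface.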
EFFICIENT GRADIENT DESCENT METHOD OF RBF NEURAL NETWORKS WITH ADAPTIVE LEARNING RATE
11
Authors: Lin Jiayu, Liu Ying (School of Electro. Sci. and Tech., National Univ. of Defence Technology, Changsha 410073). Journal of Electronics (China), 2002, No. 3, pp. 255-258 (4 pages)
A new algorithm to exploit the learning rates of the gradient descent method is presented, based on the second-order Taylor expansion of the error energy function with respect to the learning rate, at some values decided by an "award-punish" strategy. A detailed derivation of the algorithm applied to RBF networks is given. Simulation studies show that this algorithm can increase the rate of convergence and improve the performance of the gradient descent method.
Keywords: gradient descent method; learning rate; RBF neural networks
Linear Regression and Gradient Descent Method for Electricity Output Power Prediction
12
Authors: Yuanliang Liao. Journal of Computer and Communications, 2019, No. 12, pp. 31-36 (6 pages)
Regulating the power output of a power plant as demand for electricity fluctuates throughout the day is important for both economic purposes and the safety of the generator. In this work, the gradient descent method together with regularization is investigated to study the electricity output related to vacuum level and temperature in the turbine. Ninety percent of the data was used to train the regression parameters while the remaining ten percent was used for validation. Final results showed that 99% accuracy could be obtained with this method. This opens a new window for electricity output prediction for power plants.
Keywords: machine learning; linear algebra; linear regression; gradient descent; error analysis
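A minimal sketch of the workflow described in this abstract: fit a linear model by batch gradient descent and hold out ten percent of the data for validation. The data here are synthetic and noise-free, the L2 (ridge) penalty is an assumption since the abstract does not name the regularizer, and `predict`, `fit_linear_gd`, and `mse` are hypothetical helper names.

```python
def predict(w, b, x):
    """Linear model output for one feature vector x."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def fit_linear_gd(X, y, lr=0.5, lam=0.0, steps=2000):
    """Batch gradient descent on half the mean squared error plus (lam / 2) * ||w||^2."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        errs = [predict(w, b, xi) - yi for xi, yi in zip(X, y)]
        grad_w = [
            sum(e * xi[j] for e, xi in zip(errs, X)) / n + lam * w[j]
            for j in range(d)
        ]
        grad_b = sum(errs) / n  # the bias is not regularized
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def mse(w, b, X, y):
    """Mean squared error of the model on (X, y)."""
    return sum((predict(w, b, xi) - yi) ** 2 for xi, yi in zip(X, y)) / len(X)


# Synthetic, noise-free data following y = 2x + 1 (illustrative only).
X_all = [[i / 20] for i in range(20)]
y_all = [2.0 * xi[0] + 1.0 for xi in X_all]
split = (9 * len(X_all)) // 10  # 90% train, 10% validation
w_fit, b_fit = fit_linear_gd(X_all[:split], y_all[:split])
val_error = mse(w_fit, b_fit, X_all[split:], y_all[split:])
```

Setting `lam` above zero shrinks the weights toward zero, which trades a little bias for lower variance when the features are noisy or correlated.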
Pure quantum gradient descent algorithm and full quantum variational eigensolver
13
Authors: Ronghang Chen, Zhou Guang, (+2 more authors), Cong Guo, Guanru Feng, Shi-Yao Hou. Frontiers of Physics (SCIE, CSCD), 2024, No. 2, pp. 221-234 (14 pages)
Optimization problems are prevalent in various fields, and the gradient-based gradient descent algorithm is a widely adopted optimization method. However, in classical computing, computing the numerical gradient for a function with d variables necessitates at least d+1 function evaluations, resulting in a computational complexity of O(d). As the number of variables increases, the classical gradient estimation methods require substantial resources, ultimately surpassing the capabilities of classical computers. Fortunately, leveraging the principles of superposition and entanglement in quantum mechanics, quantum computers can achieve genuine parallel computing, leading to exponential acceleration over classical algorithms in some cases. In this paper, we propose a novel quantum-based gradient calculation method that requires only a single oracle calculation to obtain the numerical gradient result for a multivariate function. The complexity of this algorithm is just O(1). Building upon this approach, we successfully implemented the quantum gradient descent algorithm and applied it to the variational quantum eigensolver (VQE), creating a pure quantum variational optimization algorithm. Compared with classical gradient-based optimization algorithms, this quantum optimization algorithm has remarkable complexity advantages, providing an efficient solution to optimization problems. The proposed quantum-based method shows promise in enhancing the performance of optimization algorithms, highlighting the potential of quantum computing in this field.
Keywords: quantum algorithm; gradient descent; variational quantum algorithm
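The classical cost quoted in the abstract can be seen in a forward-difference estimate, which evaluates the function once at the base point and once per coordinate, for d + 1 evaluations in total. This sketch is purely classical and only illustrates that count; `forward_difference_gradient` and the step size `h` are illustrative choices, not taken from the paper.

```python
def forward_difference_gradient(f, x, h=1e-6):
    """Estimate the gradient of f at x using d + 1 evaluations of f."""
    f0 = f(x)  # one evaluation at the base point
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h
        # One extra evaluation per coordinate: d evaluations in this loop.
        grad.append((f(shifted) - f0) / h)
    return grad


# Example: f(x) = sum of squares, whose exact gradient is 2x.
approx = forward_difference_gradient(lambda v: sum(t * t for t in v), [1.0, 2.0, 3.0])
```

For this example the estimate is close to [2, 4, 6], with an error on the order of `h` coming from the one-sided difference.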
Fractional Order Iteration for Gradient Descent Method Based on Event-Triggered Mechanism
14
Authors: LU Jiajie, WANG Yong, FAN Yuan. Journal of Systems Science & Complexity (SCIE, EI, CSCD), 2023, No. 5, pp. 1927-1948 (22 pages)
In this work, a novel gradient descent method based on an event-triggered strategy is proposed, which involves integer and fractional order iteration. Firstly, the convergence of the integer order iterative optimization method and the stability of its associated system with integrator dynamics are linked. Based on this result, a fractional order iteration approach has been developed by modelling the system with fractional order dynamics. Secondly, to reduce the consumption of computation, a feedback based event-triggered mechanism has been introduced to the gradient descent method. The convergence of this new event-triggered optimization algorithm is guaranteed by using a Lyapunov method, and Zeno behavior is proved to be avoided simultaneously. Lastly, the effectiveness and advantages of the proposed algorithms are verified by numerical simulations.
Keywords: event-triggered mechanism; fractional order iteration; gradient descent; Zeno behavior
A Stochastic Gradient Descent Method for Computational Design of Random Rough Surfaces in Solar Cells
15
Authors: Qiang Li, Gang Bao, (+1 more author), Yanzhao Cao, Junshan Lin. Communications in Computational Physics (SCIE), 2023, No. 10, pp. 1361-1390 (30 pages)
In this work, we develop a stochastic gradient descent method for the computational optimal design of random rough surfaces in thin-film solar cells. We formulate the design problems as random PDE-constrained optimization problems and seek the optimal statistical parameters for the random surfaces. The optimizations at fixed frequency as well as at multiple frequencies and multiple incident angles are investigated. To evaluate the gradient of the objective function, we derive the shape derivatives for the interfaces and apply the adjoint state method to perform the computation. The stochastic gradient descent method evaluates the gradient of the objective function only at a few samples for each iteration, which reduces the computational cost significantly. Various numerical experiments are conducted to illustrate the efficiency of the method and significant increases of the absorptance for the optimal random structures. We also examine the convergence of the stochastic gradient descent algorithm theoretically and prove that the numerical method is convergent under certain assumptions for the random interfaces.
Keywords: optimal design; random rough surface; solar cell; Helmholtz equation; stochastic gradient descent method
A Gradient Descent Method for Estimating the Markov Chain Choice Model
16
Authors: Lei Fu, Dong-Dong Ge. Journal of the Operations Research Society of China (EI, CSCD), 2023, No. 2, pp. 371-381 (11 pages)
In this paper, we propose a gradient descent method to estimate the parameters in a Markov chain choice model. In particular, we derive a closed-form formula for the gradient of the log-likelihood function and show the convergence of the algorithm. Numerical experiments verify the efficiency of our approach by comparing it with the expectation-maximization algorithm. We show that a similar result can be extended to the more general case in which the no-purchase data are not observed.
Keywords: Markov chain choice model; parameter estimation; gradient descent method
Stochastic Gradient Compression for Federated Learning over Wireless Network
17
Authors: Lin Xiaohan, Liu Yuan, (+2 more authors), Chen Fangjiong, Huang Yang, Ge Xiaohu. China Communications (SCIE, CSCD), 2024, No. 4, pp. 230-247 (18 pages)
As a mature distributed machine learning paradigm, federated learning enables wireless edge devices to collaboratively train a shared AI model by stochastic gradient descent (SGD). However, devices need to upload high-dimensional stochastic gradients to the edge server during training, which causes a severe communication bottleneck. To address this problem, we compress the communication by sparsifying and quantizing the stochastic gradients of edge devices. We first derive a closed form of the communication compression in terms of sparsification and quantization factors. Then, the convergence rate of this communication-compressed system is analyzed and several insights are obtained. Finally, we formulate and deal with the quantization resource allocation problem for the goal of minimizing the convergence upper bound, under the constraint of multiple-access channel capacity. Simulations show that the proposed scheme outperforms the benchmarks.
Keywords: federated learning; gradient compression; quantization resource allocation; stochastic gradient descent (SGD)
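The sparsify-then-quantize idea can be sketched with two generic operators: top-k sparsification and a deterministic uniform quantizer. These are common textbook choices assumed for illustration, not necessarily the scheme analysed in the paper, and the function names are hypothetical.

```python
def uniform_quantize(values, bits):
    """Round each value to one of 2**bits evenly spaced levels on [-scale, scale]."""
    levels = 2 ** bits - 1
    scale = max(abs(v) for v in values)  # assumes a non-empty list
    if scale == 0.0:
        return list(values)
    step = 2 * scale / levels
    return [round((v + scale) / step) * step - scale for v in values]


def compress(grad, k, bits):
    """Keep the k largest-magnitude entries, then quantize only those.

    Returns (indices, quantized_values); all other entries are implicitly zero,
    so only k indices and k low-precision values would need to be uploaded.
    """
    idx = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    idx.sort()
    return idx, uniform_quantize([grad[i] for i in idx], bits)


def decompress(idx, vals, dim):
    """Rebuild a dense gradient of length dim from the compressed form."""
    dense = [0.0] * dim
    for i, v in zip(idx, vals):
        dense[i] = v
    return dense


kept_idx, kept_vals = compress([0.1, -3.0, 2.0, 0.5], k=2, bits=8)
restored = decompress(kept_idx, kept_vals, 4)
```

Quantizing only the retained entries keeps the dropped coordinates exactly zero, so the sparsity gained in the first step is not undone by the second.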
Anderson Acceleration of Gradient Methods with Energy for Optimization Problems
18
Authors: Hailiang Liu, Jia-Hao He, Xuping Tian. Communications on Applied Mathematics and Computation (EI), 2024, No. 2, pp. 1299-1318 (20 pages)
Anderson acceleration (AA) is an extrapolation technique designed to speed up fixed-point iterations. For optimization problems, we propose a novel algorithm by combining the AA with the energy adaptive gradient method (AEGD) [arXiv:2010.05109]. The feasibility of our algorithm is ensured in light of the convergence theory for AEGD, though it is not a fixed-point iteration. We provide rigorous convergence rates of AA for gradient descent (GD) by an acceleration factor of the gain at each implementation of AA-GD. Our experimental results show that the proposed AA-AEGD algorithm requires little tuning of hyperparameters and exhibits superior fast convergence.
Keywords: Anderson acceleration (AA); gradient descent (GD); energy stability
Convergence of Stochastic Gradient Descent in Deep Neural Network (cited by: 4)
19
Authors: Bai-cun ZHOU, Cong-ying HAN, Tian-de GUO. Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 2021, No. 1, pp. 126-136 (11 pages)
Stochastic gradient descent (SGD) is one of the most common optimization algorithms used in pattern recognition and machine learning. This algorithm and its variants are the preferred algorithms for optimizing the parameters of deep neural networks because of their low storage requirements and fast computation speed. Previous studies on the convergence of these algorithms were based on some traditional assumptions in optimization problems. However, the deep neural network has its unique properties, and some assumptions are inappropriate in the actual optimization process of this kind of model. In this paper, we modify the assumptions to make them more consistent with the actual optimization process of deep neural networks. Based on the new assumptions, we study the convergence and convergence rate of SGD and its two common variant algorithms. In addition, we carry out numerical experiments with LeNet-5, a common network framework, on the MNIST data set to verify the rationality of our assumptions.
Keywords: stochastic gradient descent; deep neural network; convergence
Convergence analysis of projected gradient descent for Schatten-p nonconvex matrix recovery (cited by: 2)
20
Authors: CAI Yun, LI Song. Science China Mathematics (SCIE, CSCD), 2015, No. 4, pp. 845-858 (14 pages)
The matrix rank minimization problem arises in many engineering applications. As this problem is NP-hard, a nonconvex relaxation of matrix rank minimization, called the Schatten-p quasi-norm minimization (0 < p < 1), has been developed to approximate the rank function closely. We study the performance of the projected gradient descent algorithm for solving the Schatten-p quasi-norm minimization (0 < p < 1) problem. Based on the matrix restricted isometry property (M-RIP), we give the convergence guarantee and error bound for this algorithm and show that the algorithm is robust to noise with an exponential convergence rate.
Keywords: low rank matrix recovery; nonconvex matrix recovery; projected gradient descent; restricted isometry property