1,512 articles found
L1-Smooth SVM with Distributed Adaptive Proximal Stochastic Gradient Descent with Momentum for Fast Brain Tumor Detection
1
Authors: Chuandong Qin, Yu Cao, Liqun Meng. Computers, Materials & Continua (SCIE, EI), 2024, No. 5, pp. 1975-1994 (20 pages)
Brain tumors come in various types, each with distinct characteristics and treatment approaches, making manual detection a time-consuming and potentially ambiguous process. Brain tumor detection is a valuable tool for gaining a deeper understanding of tumors and improving treatment outcomes. Machine learning models have become key players in automating brain tumor detection, and gradient descent methods are the mainstream algorithms for solving such models. In this paper, we propose a novel distributed proximal stochastic gradient descent approach to solve the L1-Smooth Support Vector Machine (SVM) classifier for brain tumor detection. Firstly, the smooth hinge loss is used as the loss function of the SVM; it avoids the nondifferentiability at zero that the traditional hinge loss encounters during gradient descent optimization. Secondly, L1 regularization is employed to sparsify features and enhance the robustness of the model. Finally, adaptive proximal stochastic gradient descent (PGD) with momentum and distributed adaptive PGD with momentum (DPGD) are proposed and applied to the L1-Smooth SVM. Distributed computing is crucial in large-scale data analysis: extending algorithms to distributed clusters enables more efficient processing of massive amounts of data. The DPGD algorithm leverages Spark, enabling full utilization of the computer's multi-core resources. Due to the sparsity induced by L1 regularization on the parameters, it exhibits significantly accelerated convergence. From the perspective of loss reduction, DPGD converges faster than PGD. The experimental results show that adaptive PGD with momentum and its variants achieve cutting-edge accuracy and efficiency in brain tumor detection. From pre-trained models, both PGD and DPGD outperform other models, reaching an accuracy of 95.21%.
Keywords: Support vector machine; proximal stochastic gradient descent; brain tumor detection; distributed computing
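The proximal step that the abstract above relies on can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes a quadratically smoothed hinge loss (the abstract does not give the exact smoothing), uses full-batch gradients instead of stochastic ones so the run is deterministic, applies momentum as an extrapolation of the iterate, and omits the adaptive step size and the Spark distribution. The names `smooth_hinge_grad`, `soft_threshold`, and `fit_l1_smooth_svm` and all hyperparameter values are hypothetical.

```python
# Sketch only: proximal gradient with momentum (iterate extrapolation) for an
# L1-regularized SVM with a quadratically smoothed hinge loss.
# Loss form, step size, and momentum value are assumptions, not the paper's.

def smooth_hinge_grad(z):
    """Derivative of the assumed smoothed hinge:
    0.5 - z for z <= 0, 0.5 * (1 - z)**2 for 0 < z < 1, 0 for z >= 1."""
    if z <= 0.0:
        return -1.0
    if z < 1.0:
        return z - 1.0
    return 0.0


def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def fit_l1_smooth_svm(samples, labels, lam=0.1, lr=0.1, beta=0.5, iters=300):
    dim = len(samples[0])
    n = len(samples)
    w = [0.0] * dim
    w_prev = [0.0] * dim
    for _ in range(iters):
        # Momentum: extrapolate along the previous displacement.
        y = [wi + beta * (wi - pi) for wi, pi in zip(w, w_prev)]
        grad = [0.0] * dim
        for x, label in zip(samples, labels):
            margin = label * sum(yj * xj for yj, xj in zip(y, x))
            slope = smooth_hinge_grad(margin)
            for j in range(dim):
                grad[j] += slope * label * x[j] / n
        w_prev = w
        # Gradient step on the smooth loss, then the L1 proximal step.
        w = [soft_threshold(yj - lr * gj, lr * lam) for yj, gj in zip(y, grad)]
    return w
```

On a toy set where only the first feature carries the label, e.g. `fit_l1_smooth_svm([[2.0, 1.0], [2.0, -1.0], [-2.0, 1.0], [-2.0, -1.0]], [1, 1, -1, -1])`, the weight of the uninformative second feature stays exactly zero, which is the sparsifying effect the L1 proximal step is meant to provide.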
Convergence of Hyperbolic Neural Networks Under Riemannian Stochastic Gradient Descent
2
Authors: Wes Whiting, Bao Wang, Jack Xin. Communications on Applied Mathematics and Computation (EI), 2024, No. 2, pp. 1175-1188 (14 pages)
We prove, under mild conditions, the convergence of a Riemannian gradient descent method for a hyperbolic neural network regression model, in both the batch gradient descent and stochastic gradient descent settings. We also discuss a Riemannian version of the Adam algorithm and show numerical simulations of these algorithms on various benchmarks.
Keywords: Hyperbolic neural network; Riemannian gradient descent; Riemannian Adam (RAdam); training convergence
Fractional Gradient Descent RBFNN for Active Fault-Tolerant Control of Plant Protection UAVs
3
Authors: Lianghao Hua, Jianfeng Zhang, Dejie Li, Xiaobo Xi. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 3, pp. 2129-2157 (29 pages)
With the increasing prevalence of high-order systems in engineering applications, these systems often exhibit significant disturbances and can be challenging to model accurately. As a result, the active disturbance rejection controller (ADRC) has been widely applied in various fields. However, in controlling plant protection unmanned aerial vehicles (UAVs), which are typically large and subject to significant disturbances, load disturbances and the possibility of multiple actuator faults during pesticide spraying pose significant challenges. To address these issues, this paper proposes a novel fault-tolerant control method that combines a radial basis function neural network (RBFNN) with a second-order ADRC and leverages a fractional gradient descent (FGD) algorithm. We integrate the plant protection UAV model's uncertain parameters, load disturbance parameters, and actuator fault parameters and utilize the RBFNN for system parameter identification. The resulting ADRC exhibits load disturbance suppression and fault tolerance capabilities, and our proposed active fault-tolerant control law has Lyapunov stability implications. Experimental results obtained using a multi-rotor fault-tolerant test platform demonstrate that the proposed method outperforms other control strategies regarding load disturbance suppression and fault-tolerant performance.
Keywords: Radial basis function neural network; plant protection unmanned aerial vehicle; active disturbance rejection controller; fractional gradient descent algorithm
Efficient and High-quality Recommendations via Momentum-incorporated Parallel Stochastic Gradient Descent-Based Learning (cited by: 5)
4
Authors: Xin Luo, Wen Qin, Ani Dong, Khaled Sedraoui, MengChu Zhou. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2021, No. 2, pp. 402-411 (10 pages)
A recommender system (RS) relying on latent factor analysis usually adopts stochastic gradient descent (SGD) as its learning algorithm. However, owing to its serial mechanism, an SGD algorithm suffers from low efficiency and scalability when handling large-scale industrial problems. To address this issue, this study proposes a momentum-incorporated parallel stochastic gradient descent (MPSGD) algorithm, whose main idea is two-fold: a) implementing parallelization via a novel data-splitting strategy, and b) accelerating the convergence rate by integrating momentum effects into its training process. With it, an MPSGD-based latent factor (MLF) model is achieved, which is capable of performing efficient and high-quality recommendations. Experimental results on four high-dimensional and sparse matrices generated by industrial RS indicate that, owing to the MPSGD algorithm, an MLF model outperforms the existing state-of-the-art ones in both computational efficiency and scalability.
Keywords: Big data; industrial application; industrial data; latent factor analysis; machine learning; parallel algorithm; recommender system (RS); stochastic gradient descent (SGD)
A modified three-term conjugate gradient method with sufficient descent property (cited by: 1)
5
Author: Saman Babaie-Kafaki. Applied Mathematics (A Journal of Chinese Universities) (SCIE, CSCD), 2015, No. 3, pp. 263-272 (10 pages)
A hybridization of the three-term conjugate gradient method proposed by Zhang et al. and the nonlinear conjugate gradient method proposed by Polak and Ribière, and Polyak is suggested. Based on an eigenvalue analysis, it is shown that the search directions of the proposed method satisfy the sufficient descent condition, independent of the line search and of the convexity of the objective function. Global convergence of the method is established under an Armijo-type line search condition. Numerical experiments show the practical efficiency of the proposed method.
Keywords: unconstrained optimization; conjugate gradient method; eigenvalue; sufficient descent condition; global convergence
PROJECTED GRADIENT DESCENT BASED ON SOFT THRESHOLDING IN MATRIX COMPLETION (cited by: 1)
6
Authors: Zhao Yujuan, Zheng Baoyu, Chen Shouning. Journal of Electronics (China), 2013, No. 6, pp. 517-524 (8 pages)
Matrix completion is an extension of compressed sensing. In compressed sensing, underdetermined equations are solved using a sparsity prior on the unknown signal. In matrix completion, underdetermined equations are solved using a sparsity prior on the set of singular values of the unknown matrix, also called the low-rank prior. This paper first introduces the basic concept of matrix completion, analyzes which matrices are suitable for it, and shows that such a matrix should satisfy two conditions: low rank and the incoherence property. The paper then reviews three reconstruction algorithms commonly used in matrix completion (the singular value thresholding algorithm, singular value projection, and atomic decomposition for minimum rank approximation) and points out their shortcoming: they need to know the rank of the original matrix. The Projected Gradient Descent based on Soft Thresholding (STPGD) algorithm proposed in this paper predicts the rank of the unknown matrix using soft thresholding and iterates based on projected gradient descent, so it can estimate the rank of the unknown matrix exactly with low computational complexity; this is verified by numerical experiments. We also analyze the convergence and computational complexity of the STPGD algorithm, show that it is guaranteed to converge, and analyze the number of iterations needed to reach a given reconstruction error. Comparing the computational complexity of STPGD with that of other algorithms, we conclude that STPGD not only reduces the computational complexity but also improves the precision of the reconstructed solution.
Keywords: Matrix Completion (MC); Compressed Sensing (CS); iterative thresholding algorithm; Projected Gradient Descent based on Soft Thresholding (STPGD)
An Efficient Energy Routing Protocol Based on Gradient Descent Method in WSNs (cited by: 1)
7
Authors: Ru Jin, Xinlian Zhou, Yue Wang. Journal of Information Hiding and Privacy Protection, 2020, No. 3, pp. 115-123 (9 pages)
In a wireless sensor network [1], the operation of a node depends on the battery power it carries. For environmental reasons, the node's battery cannot be replaced. To extend the life cycle of the network, energy becomes one of the key problems in the design of a wireless sensor network (WSN) routing protocol [2]. This paper proposes a routing protocol, ERGD, based on the gradient descent method, that minimizes energy consumption. Within the communication radius of the current node, the distance between the current node and a candidate next-hop node is assumed to generate a projected energy along the distance from the current node to the base station (BS); this projected energy and the remaining energy of the candidate node are the key factors in selecting the next hop. The simulation results show that the proposed protocol effectively extends the life cycle of the network and improves the reliability and fault tolerance of the system.
Keywords: Wireless sensor network; gradient descent; residual energy; communication radius; network life cycle
A Relatively Accelerated SGD Algorithm for Solving a Class of Non-smooth Convex Optimization Problems
8
Authors: 张文娟, 冯象初, 肖锋, 黄姝娟, 李欢. Journal of Xidian University (西安电子科技大学学报) (EI, CAS, CSCD, PKU Core), 2024, No. 3, pp. 147-157 (11 pages)
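First-order optimization algorithms are widely used in machine learning, big data science, computer vision, and other fields because they are computationally simple and cheap. However, most existing first-order algorithms require the objective function to have a Lipschitz continuous gradient, a requirement that many practical problems do not satisfy. Building on the classical gradient descent algorithm, this paper introduces stochasticity and acceleration and proposes a relatively accelerated stochastic gradient descent algorithm. The algorithm does not require the objective function to have a Lipschitz continuous gradient; instead, by generalizing the Euclidean distance to the Bregman distance, it weakens the Lipschitz continuous gradient condition to a relative smoothness condition. The convergence of the relatively accelerated stochastic gradient descent algorithm depends on the uniform triangle scaling exponent. To avoid the effort of tuning the optimal uniform triangle scaling exponent, an adaptive relatively accelerated stochastic gradient descent algorithm is given, which selects this parameter adaptively. Theoretical convergence analysis shows that the objective values of the iterates converge to the optimal objective value. Numerical experiments on the Poisson inverse problem and on a minimization problem whose Hessian operator norm grows polynomially with the norm of the variable show that the adaptive relatively accelerated stochastic gradient descent algorithm and the relatively accelerated stochastic gradient descent algorithm converge better than the relative stochastic gradient descent algorithm.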
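Keywords: convex optimization; non-smooth optimization; relative smoothness; stochastic programming; gradient methods; accelerated stochastic gradient descent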
Designing fuzzy inference system based on improved gradient descent method
9
Authors: Zhang Liquan, Shao Cheng. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2006, No. 4, pp. 853-857, 863 (6 pages)
The distribution of the sampling data influences the completeness of the rule base, so extrapolating missing rules is very difficult. Based on data mining, a self-learning method is developed for identifying the fuzzy model and extrapolating missing rules by means of a confidence measure and an improved gradient descent method. The proposed approach can not only identify the fuzzy model, update its parameters, and determine the optimal output fuzzy sets simultaneously, but also resolve the uncontrollability caused by regions that the data do not cover. Simulation results on the classical truck backer-upper control problem show the effectiveness and accuracy of the proposed approach.
Keywords: data mining; fuzzy system; gradient descent method; missing rule
Gradient Descent Algorithm for Small UAV Parameter Estimation System
10
Authors: Guo Jiandong, Liu Qingwen, Wang Kang. Transactions of Nanjing University of Aeronautics and Astronautics (EI, CSCD), 2017, No. 6, pp. 680-687 (8 pages)
A gradient descent algorithm with an adjustable parameter is developed for attitude estimation, aimed at attitude measurement for small unmanned aerial vehicles (UAVs) in real-time flight conditions. The accelerometer and magnetometer are introduced to construct an error equation with the gyros, so that the gyroscope drift can be compensated by solving the error equation with the gradient descent algorithm. The performance of the presented algorithm is evaluated using a self-developed micro-electro-mechanical system (MEMS) based attitude heading reference system mounted on a tri-axis turntable. The on-ground, turntable, and flight experiments indicate that the estimated attitude has good accuracy. The presented system is also compared with an open-source flight control system running an extended Kalman filter (EKF), and the results show that the attitude control system using the gradient descent method can estimate UAV attitudes effectively.
Keywords: gradient descent algorithm; attitude estimation; quaternions; small unmanned aerial vehicle (UAV)
Staged Adaptive Optimization of the SPGD Algorithm in Laser Coherent Beam Combining Systems
11
Authors: 郑文慧, 祁家琴, 江文隽, 谭贵元, 胡奇琪, 高怀恩, 豆嘉真, 邸江磊, 秦玉文. Infrared and Laser Engineering (红外与激光工程) (EI, CSCD, PKU Core), 2024, No. 9, pp. 303-315 (13 pages)
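To address the slow convergence and tendency to become trapped in local optima of the conventional stochastic parallel gradient descent (SPGD) algorithm when applied to large-scale laser coherent beam combining systems, a staged adaptive-gain SPGD algorithm, Staged SPGD, is proposed. Based on the value of the performance metric function, the algorithm adaptively adjusts the gain coefficient using different strategies at different stages of convergence, and introduces a control-voltage update strategy with a gradient update factor, which speeds up convergence while reducing the probability of falling into local extrema. Experimental results show that in a 19-channel laser coherent beam combining system, Staged SPGD converges 36.84% faster than the conventional SPGD algorithm; it also converges well under phase noise of different frequencies and amplitudes, with markedly improved stability. Moreover, when Staged SPGD is applied directly to 37-, 61-, and 91-channel coherent beam combining systems, its convergence speed improves on conventional SPGD by 37.88%, 40.85%, and 41.10%, respectively. The improvement grows with the number of combined elements, indicating that the algorithm has advantages in convergence speed, stability, and scalability and has the potential to extend to large-scale coherent beam combining systems.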
Keywords: laser coherent beam combining; phase control; stochastic parallel gradient descent algorithm; SPGD algorithm
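For orientation, the conventional fixed-gain SPGD loop that the abstract above takes as its baseline can be sketched as below. This is not the Staged SPGD algorithm: the staged gain schedule and the gradient update factor are not specified in the abstract and are left out. The beam model (unit-amplitude beams with static piston phase errors), the metric `combining_metric`, and the values of `gain` and `delta` are illustrative assumptions.

```python
# Sketch only: conventional fixed-gain SPGD on a simulated beam-combining metric.
# The beam model and all parameter values are assumptions for illustration.
import cmath
import random


def combining_metric(phases):
    """Normalized on-axis intensity of N unit-amplitude beams (1.0 = fully in phase)."""
    field = sum(cmath.exp(1j * p) for p in phases)
    return abs(field) ** 2 / len(phases) ** 2


def spgd(piston_errors, gain=20.0, delta=0.1, iters=1000, seed=0):
    rng = random.Random(seed)
    n = len(piston_errors)
    u = [0.0] * n  # control phases applied by the phase modulators
    history = []
    for _ in range(iters):
        # Perturb every channel in parallel by +delta or -delta.
        d = [delta if rng.random() < 0.5 else -delta for _ in range(n)]
        j_plus = combining_metric([e + ui + di for e, ui, di in zip(piston_errors, u, d)])
        j_minus = combining_metric([e + ui - di for e, ui, di in zip(piston_errors, u, d)])
        # Ascent step along the perturbation, scaled by the two-sided metric change.
        dj = j_plus - j_minus
        u = [ui + gain * dj * di for ui, di in zip(u, d)]
        history.append(combining_metric([e + ui for e, ui in zip(piston_errors, u)]))
    return u, history
```

An adaptive variant in the spirit of the abstract would replace the constant `gain` with a value chosen from the current metric, but the specific rule would have to come from the paper itself.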
A Descent Gradient Method and Its Global Convergence
12
Author: LIU Jin-kui. Chinese Quarterly Journal of Mathematics (CSCD), 2014, No. 1, pp. 142-150 (9 pages)
Y. Liu and C. Storey (1992) proposed the famous LS conjugate gradient method, which has good numerical results. However, the LS method has very weak convergence under the Wolfe-type line search. In this paper, we give a new descent gradient method based on the LS method. It guarantees the sufficient descent property at each iteration and global convergence under the strong Wolfe line search. Finally, we present extensive preliminary numerical experiments to show the efficiency of the proposed method by comparing it with the famous PRP+ method.
Keywords: unconstrained optimization; conjugate gradient method; strong Wolfe line search; sufficient descent property; global convergence
Linear Regression and Gradient Descent Method for Electricity Output Power Prediction
13
Author: Yuanliang Liao. Journal of Computer and Communications, 2019, No. 12, pp. 31-36 (6 pages)
Regulating the power output of a power plant as demand for electricity fluctuates throughout the day is important for both economic reasons and the safety of the generator. In this work, the gradient descent method together with regularization is used to study how electricity output relates to vacuum level and temperature in the turbine. Ninety percent of the data was used to train the regression parameters, while the remaining ten percent was used for validation. Final results showed that 99% accuracy could be obtained with this method. This opens a new window for electricity output prediction for power plants.
Keywords: Machine learning; linear algebra; linear regression; gradient descent; error analysis
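The workflow this abstract describes (regularized linear regression fitted by gradient descent, with a 90/10 train/validation split) can be sketched as follows. The data here are synthetic and noise-free stand-ins, not the plant measurements; the two features merely play the role of vacuum level and temperature, and `make_data`, `fit_ridge_gd`, `mse`, the L2 penalty form, and every coefficient are hypothetical choices for illustration.

```python
# Sketch only: batch gradient descent for L2-regularized linear regression
# on synthetic data with a 90/10 split. All values are illustrative assumptions.

def make_data(n=60):
    """Synthetic stand-in for (feature 1, feature 2) -> output, scaled to [0, 1]."""
    rows = []
    for i in range(n):
        x1 = (i % 7) / 6.0
        x2 = (i % 3) / 2.0
        rows.append(((x1, x2), 3.0 * x1 - 2.0 * x2 + 5.0))
    return rows


def fit_ridge_gd(rows, lr=0.5, l2=1e-3, iters=5000):
    w = [0.0, 0.0]
    b = 0.0
    n = len(rows)
    for _ in range(iters):
        gw = [0.0, 0.0]
        gb = 0.0
        for (x1, x2), y in rows:
            err = w[0] * x1 + w[1] * x2 + b - y
            gw[0] += err * x1 / n
            gw[1] += err * x2 / n
            gb += err / n
        # L2 penalty on the weights only; the intercept is left unpenalized.
        w = [w[0] - lr * (gw[0] + l2 * w[0]), w[1] - lr * (gw[1] + l2 * w[1])]
        b -= lr * gb
    return w, b


def mse(rows, w, b):
    """Mean squared prediction error on a list of ((x1, x2), y) rows."""
    return sum((w[0] * x1 + w[1] * x2 + b - y) ** 2 for (x1, x2), y in rows) / len(rows)


data = make_data()
valid = data[::10]                                  # every 10th row: 10% held out
train = [r for i, r in enumerate(data) if i % 10]   # remaining 90% for training
w, b = fit_ridge_gd(train)
print("weights:", w, "intercept:", b, "validation MSE:", mse(valid, w, b))
```

Because the features are already scaled to [0, 1], a fixed step size works here; real sensor data would usually need standardization before plain gradient descent converges at a reasonable rate.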
A New Descent Nonlinear Conjugate Gradient Method for Unconstrained Optimization
14
Authors: Hao Fan, Zhibin Zhu, Anwa Zhou. Applied Mathematics, 2011, No. 9, pp. 1119-1123 (5 pages)
In this paper, a new nonlinear conjugate gradient method is proposed for large-scale unconstrained optimization. The sufficient descent property holds without any line search. Using a step-length technique that ensures the Zoutendijk condition holds, the method is proved to be globally convergent. Finally, we improve the method and analyze it further.
Keywords: Large-scale unconstrained optimization; conjugate gradient method; sufficient descent property; globally convergent
Rockburst Intensity Grade Prediction Model Based on Batch Gradient Descent and Multi-Scale Residual Deep Neural Network
15
Authors: Yu Zhang, Mingkui Zhang, Jitao Li, Guangshu Chen. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 11, pp. 1987-2006 (20 pages)
Rockburst is a phenomenon in which free surfaces are formed during excavation, which subsequently causes the sudden release of energy in the construction of mines and tunnels. Light rockburst only peels off rock slices without ejection, while severe rockburst causes casualties and property loss. The frequency and degree of rockburst damage increase with the excavation depth. Moreover, rockburst is the leading engineering geological hazard in the excavation process, so predicting its intensity grade is of great significance to the development of geotechnical engineering and is a problem that urgently needs to be solved. By comprehensively considering the occurrence mechanism of rockburst, this paper selects the stress index (σθ/σc), brittleness index (σc/σt), and rock elastic energy index (Wet) as the rockburst evaluation indexes through the Spearman coefficient method. This overcomes the low accuracy of prediction methods based on a single evaluation index. Following this, the BGD-MSR-DNN rockburst intensity grade prediction model, based on batch gradient descent and a multi-scale residual deep neural network, is proposed. The batch gradient descent (BGD) module is used to replace the gradient descent algorithm, which effectively improves the efficiency of the network and reduces the model training time. Moreover, the multi-scale residual (MSR) module solves the problem of network degradation when the deep neural network (DNN) has too many hidden layers, thus improving the model prediction accuracy. The experimental results show that the BGD-MSR-DNN model reaches an accuracy of 97.1%, outperforming other comparable models. Finally, on actual projects such as the Qinling Tunnel and the Daxiangling Tunnel, the model reached an accuracy of 100%. The model can be applied in mine and tunnel engineering to realize accurate and rapid prediction of rockburst intensity grade.
Keywords: Rockburst prediction; rockburst intensity grade; deep neural network; batch gradient descent; multi-scale residual
EFFICIENT GRADIENT DESCENT METHOD OF RBF NEURAL NETWORKS WITH ADAPTIVE LEARNING RATE
16
Authors: Lin Jiayu, Liu Ying (School of Electro. Sci. and Tech., National Univ. of Defence Technology, Changsha 410073). Journal of Electronics (China), 2002, No. 3, pp. 255-258 (4 pages)
A new algorithm to exploit the learning rates of the gradient descent method is presented, based on the second-order Taylor expansion of the error energy function with respect to the learning rate at some values decided by an "award-punish" strategy. A detailed derivation of the algorithm applied to RBF networks is given. Simulation studies show that this algorithm can increase the rate of convergence and improve the performance of the gradient descent method.
Keywords: gradient descent method; learning rate; RBF neural networks
New Diamond Block Based Gradient Descent Search Algorithm for Motion Estimation in the MPEG-4 Encoder
17
Authors: 王振洲, 李桂苓. Transactions of Tianjin University (EI, CAS), 2003, No. 3, pp. 202-205 (4 pages)
Motion estimation is an important part of the MPEG-4 encoder because of its significant impact on the bit rate and the output quality of the encoded sequence. Unfortunately, it takes a significant part of the encoding time, especially when the straightforward full search (FS) algorithm is used. In this paper, a new algorithm named the diamond block based gradient descent search (DBBGDS) algorithm is proposed, which is significantly faster than FS and gives similar output sequence quality. Several other algorithms, namely three step search (TSS), improved three step search (ITSS), new three step search (NTSS), four step search (4SS), cellular search (CS), diamond search (DS), and block based gradient descent search (BBGDS), are also implemented and compared with DBBGDS. As the experimental results show, DBBGDS has its own advantages. Although DS has been adopted by the MPEG-4 VM, its output sequence quality is worse than that of the proposed algorithm, while its complexity is similar. Compared with BBGDS, the proposed algorithm achieves better output quality.
Keywords: MPEG; motion estimation; full search (FS); block based gradient descent search (BBGDS); diamond search (DS); new three step search (NTSS)
Phase-only pattern synthesis based on gradient-descent optimization
18
Authors: Chengjun Lu, Weixing Sheng, Yubing Han, Xiaofeng Ma. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2016, No. 2, pp. 297-307 (11 pages)
By applying the phase-only technique in array antenna pattern synthesis, antenna arrays can form desired patterns using phase shifters only. A novel phase-only pattern synthesis algorithm is proposed for the passive phased array seeker. This algorithm synthesizes the main beam of the antenna pattern through least-squares approximation, thus minimizing the errors between the actual and the desired main beams. The synthesis problem can be solved by applying gradient-descent optimization. A term for suppressing side lobes is added to the synthesis problem. To obtain a side lobe level as low as possible, the algorithm assigns different weights to different directions in the side lobe region. The algorithm is run repeatedly, and the weights are adjusted adaptively according to the normalized power in the side lobe directions. Detailed examples are presented to demonstrate the accuracy and effectiveness of the proposed approach.
Keywords: pattern synthesis; phase-only; gradient-descent; adaptively
Anderson Acceleration of Gradient Methods with Energy for Optimization Problems
19
Authors: Hailiang Liu, Jia-Hao He, Xuping Tian. Communications on Applied Mathematics and Computation (EI), 2024, No. 2, pp. 1299-1318 (20 pages)
Anderson acceleration (AA) is an extrapolation technique designed to speed up fixed-point iterations. For optimization problems, we propose a novel algorithm by combining AA with the energy adaptive gradient method (AEGD) [arXiv:2010.05109]. The feasibility of our algorithm is ensured in light of the convergence theory for AEGD, though it is not a fixed-point iteration. We provide rigorous convergence rates of AA for gradient descent (GD) by an acceleration factor of the gain at each implementation of AA-GD. Our experimental results show that the proposed AA-AEGD algorithm requires little tuning of hyperparameters and exhibits superior fast convergence.
Keywords: Anderson acceleration (AA); gradient descent (GD); energy stability
Stochastic Gradient Compression for Federated Learning over Wireless Network
20
Authors: Lin Xiaohan, Liu Yuan, Chen Fangjiong, Huang Yang, Ge Xiaohu. China Communications (SCIE, CSCD), 2024, No. 4, pp. 230-247 (18 pages)
As a mature distributed machine learning paradigm, federated learning enables wireless edge devices to collaboratively train a shared AI model by stochastic gradient descent (SGD). However, devices need to upload high-dimensional stochastic gradients to the edge server during training, which causes a severe communication bottleneck. To address this problem, we compress the communication by sparsifying and quantizing the stochastic gradients of edge devices. We first derive a closed form of the communication compression in terms of sparsification and quantization factors. Then, the convergence rate of this communication-compressed system is analyzed and several insights are obtained. Finally, we formulate and solve the quantization resource allocation problem with the goal of minimizing the convergence upper bound, under the constraint of multiple-access channel capacity. Simulations show that the proposed scheme outperforms the benchmarks.
Keywords: federated learning; gradient compression; quantization resource allocation; stochastic gradient descent (SGD)
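The two compression steps named in this abstract, sparsification followed by quantization, can be illustrated with a short sketch. The abstract does not say which sparsifier or quantizer the paper uses, so top-k magnitude selection and a deterministic symmetric uniform quantizer are assumed here; `top_k_sparsify`, `uniform_quantize`, and `compress` are hypothetical names.

```python
# Sketch only: top-k sparsification followed by uniform quantization of a gradient.
# Both operators are common choices assumed for illustration, not the paper's scheme.

def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude entries and zero out the rest."""
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


def uniform_quantize(values, bits):
    """Symmetric uniform quantizer with 2**(bits - 1) - 1 levels per sign (bits >= 2)."""
    scale = max(abs(v) for v in values)
    if scale == 0.0:
        return list(values)
    levels = 2 ** (bits - 1) - 1
    return [round(v / scale * levels) / levels * scale for v in values]


def compress(grad, k, bits):
    """Sparsify first, then quantize the surviving entries."""
    return uniform_quantize(top_k_sparsify(grad, k), bits)
```

With this scheme a device would only need to transmit the k surviving indices, their integer quantization levels, and one scale factor, rather than every full-precision gradient entry.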