期刊文献+
共找到42篇文章
< 1 2 3 >
每页显示 20 50 100
Recent Progress in Reinforcement Learning and Adaptive Dynamic Programming for Advanced Control Applications 被引量:3
1
作者 Ding Wang Ning Gao +2 位作者 Derong Liu Jinna Li Frank L.Lewis 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024年第1期18-36,共19页
Reinforcement learning(RL) has roots in dynamic programming and it is called adaptive/approximate dynamic programming(ADP) within the control community. This paper reviews recent developments in ADP along with RL and ... Reinforcement learning(RL) has roots in dynamic programming and it is called adaptive/approximate dynamic programming(ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are displayed, where the main results towards discrete-time systems and continuous-time systems are surveyed, respectively.Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environment is discussed, respectively, where event-based design, robust stabilization, and game design are reviewed. Moreover, the extensions of ADP for addressing control problems under complex environment attract enormous attention. The ADP architecture is revisited under the perspective of data-driven and RL frameworks,showing how they promote ADP formulation significantly.Finally, several typical control applications with respect to RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, the comprehensive survey on ADP and RL for advanced control applications has d emonstrated its remarkable potential within the artificial intelligence era. In addition, it also plays a vital role in promoting environmental protection and industrial intelligence. 展开更多
关键词 Adaptive dynamic programming(ADP) advanced control complex environment data-driven control event-triggered design intelligent control neural networks nonlinear systems optimal control reinforcement learning(RL)
下载PDF
Adaptive Optimal Discrete-Time Output-Feedback Using an Internal Model Principle and Adaptive Dynamic Programming 被引量:1
2
作者 Zhongyang Wang Youqing Wang Zdzisław Kowalczuk 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024年第1期131-140,共10页
In order to address the output feedback issue for linear discrete-time systems, this work suggests a brand-new adaptive dynamic programming(ADP) technique based on the internal model principle(IMP). The proposed metho... In order to address the output feedback issue for linear discrete-time systems, this work suggests a brand-new adaptive dynamic programming(ADP) technique based on the internal model principle(IMP). The proposed method, termed as IMP-ADP, does not require complete state feedback-merely the measurement of input and output data. More specifically, based on the IMP, the output control problem can first be converted into a stabilization problem. We then design an observer to reproduce the full state of the system by measuring the inputs and outputs. Moreover, this technique includes both a policy iteration algorithm and a value iteration algorithm to determine the optimal feedback gain without using a dynamic system model. It is important that with this concept one does not need to solve the regulator equation. Finally, this control method was tested on an inverter system of grid-connected LCLs to demonstrate that the proposed method provides the desired performance in terms of both tracking and disturbance rejection. 展开更多
关键词 Adaptive dynamic programming(ADP) internal model principle(IMP) output feedback problem policy iteration(PI) value iteration(VI)
下载PDF
Event-based performance guaranteed tracking control for constrained nonlinear system via adaptive dynamic programming method
3
作者 Xingyi Zhang Zijie Guo +1 位作者 Hongru Ren Hongyi Li 《Journal of Automation and Intelligence》 2023年第4期239-247,共9页
An optimal tracking control problem for a class of nonlinear systems with guaranteed performance and asymmetric input constraints is discussed in this paper.The control policy is implemented by adaptive dynamic progra... An optimal tracking control problem for a class of nonlinear systems with guaranteed performance and asymmetric input constraints is discussed in this paper.The control policy is implemented by adaptive dynamic programming(ADP)algorithm under two event-based triggering mechanisms.It is often challenging to design an optimal control law due to the system deviation caused by asymmetric input constraints.First,a prescribed performance control technique is employed to guarantee the tracking errors within predetermined boundaries.Subsequently,considering the asymmetric input constraints,a discounted non-quadratic cost function is introduced.Moreover,in order to reduce controller updates,an event-triggered control law is developed for ADP algorithm.After that,to further simplify the complexity of controller design,this work is extended to a self-triggered case for relaxing the need for continuous signal monitoring by hardware devices.By employing the Lyapunov method,the uniform ultimate boundedness of all signals is proved to be guaranteed.Finally,a simulation example on a mass–spring–damper system subject to asymmetric input constraints is provided to validate the effectiveness of the proposed control scheme. 展开更多
关键词 Adaptive dynamic programming(ADP) Asymmetric input constraints Prescribed performance control Event-triggered control Optimal tracking control
下载PDF
Parallel Control for Optimal Tracking via Adaptive Dynamic Programming 被引量:23
4
作者 Jingwei Lu Qinglai Wei Fei-Yue Wang 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2020年第6期1662-1674,共13页
This paper studies the problem of optimal parallel tracking control for continuous-time general nonlinear systems.Unlike existing optimal state feedback control,the control input of the optimal parallel control is int... This paper studies the problem of optimal parallel tracking control for continuous-time general nonlinear systems.Unlike existing optimal state feedback control,the control input of the optimal parallel control is introduced into the feedback system.However,due to the introduction of control input into the feedback system,the optimal state feedback control methods can not be applied directly.To address this problem,an augmented system and an augmented performance index function are proposed firstly.Thus,the general nonlinear system is transformed into an affine nonlinear system.The difference between the optimal parallel control and the optimal state feedback control is analyzed theoretically.It is proven that the optimal parallel control with the augmented performance index function can be seen as the suboptimal state feedback control with the traditional performance index function.Moreover,an adaptive dynamic programming(ADP)technique is utilized to implement the optimal parallel tracking control using a critic neural network(NN)to approximate the value function online.The stability analysis of the closed-loop system is performed using the Lyapunov theory,and the tracking error and NN weights errors are uniformly ultimately bounded(UUB).Also,the optimal parallel controller guarantees the continuity of the control input under the circumstance that there are finite jump discontinuities in the reference signals.Finally,the effectiveness of the developed optimal parallel control method is verified in two cases. 展开更多
关键词 Adaptive dynamic programming(ADP) nonlinear optimal control parallel controller parallel control theory parallel system tracking control neural network(NN)
下载PDF
Residential Energy Scheduling for Variable Weather Solar Energy Based on Adaptive Dynamic Programming 被引量:15
5
作者 Derong Liu Yancai Xu +1 位作者 Qinglai Wei Xinliang Liu 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2018年第1期36-46,共11页
The residential energy scheduling of solar energy is an important research area of smart grid. On the demand side, factors such as household loads, storage batteries, the outside public utility grid and renewable ener... The residential energy scheduling of solar energy is an important research area of smart grid. On the demand side, factors such as household loads, storage batteries, the outside public utility grid and renewable energy resources, are combined together as a nonlinear, time-varying, indefinite and complex system, which is difficult to manage or optimize. Many nations have already applied the residential real-time pricing to balance the burden on their grid. In order to enhance electricity efficiency of the residential micro grid, this paper presents an action dependent heuristic dynamic programming(ADHDP) method to solve the residential energy scheduling problem. The highlights of this paper are listed below. First,the weather-type classification is adopted to establish three types of programming models based on the features of the solar energy. In addition, the priorities of different energy resources are set to reduce the loss of electrical energy transmissions.Second, three ADHDP-based neural networks, which can update themselves during applications, are designed to manage the flows of electricity. Third, simulation results show that the proposed scheduling method has effectively reduced the total electricity cost and improved load balancing process. The comparison with the particle swarm optimization algorithm further proves that the present method has a promising effect on energy management to save cost. 展开更多
关键词 Action dependent heuristic dynamic programming adaptive dynamic programming control strategy residential energy management smart grid
下载PDF
Policy iteration optimal tracking control for chaotic systems by using an adaptive dynamic programming approach 被引量:1
6
作者 魏庆来 刘德荣 徐延才 《Chinese Physics B》 SCIE EI CAS CSCD 2015年第3期87-94,共8页
A policy iteration algorithm of adaptive dynamic programming(ADP) is developed to solve the optimal tracking control for a class of discrete-time chaotic systems. By system transformations, the optimal tracking prob... A policy iteration algorithm of adaptive dynamic programming(ADP) is developed to solve the optimal tracking control for a class of discrete-time chaotic systems. By system transformations, the optimal tracking problem is transformed into an optimal regulation one. The policy iteration algorithm for discrete-time chaotic systems is first described. Then,the convergence and admissibility properties of the developed policy iteration algorithm are presented, which show that the transformed chaotic system can be stabilized under an arbitrary iterative control law and the iterative performance index function simultaneously converges to the optimum. By implementing the policy iteration algorithm via neural networks,the developed optimal tracking control scheme for chaotic systems is verified by a simulation. 展开更多
关键词 adaptive critic designs adaptive dynamic programming approximate dynamic programming neuro-dynamic programming
下载PDF
A novel stable value iteration-based approximate dynamic programming algorithm for discrete-time nonlinear systems
7
作者 曲延华 王安娜 林盛 《Chinese Physics B》 SCIE EI CAS CSCD 2018年第1期228-235,共8页
The convergence and stability of a value-iteration-based adaptive dynamic programming (ADP) algorithm are con- sidered for discrete-time nonlinear systems accompanied by a discounted quadric performance index. More ... The convergence and stability of a value-iteration-based adaptive dynamic programming (ADP) algorithm are con- sidered for discrete-time nonlinear systems accompanied by a discounted quadric performance index. More importantly than sufficing to achieve a good approximate structure, the iterative feedback control law must guarantee the closed-loop stability. Specifically, it is firstly proved that the iterative value function sequence will precisely converge to the optimum. Secondly, the necessary and sufficient condition of the optimal value function serving as a Lyapunov function is investi- gated. We prove that for the case of infinite horizon, there exists a finite horizon length of which the iterative feedback control law will provide stability, and this increases the practicability of the proposed value iteration algorithm. Neural networks (NNs) are employed to approximate the value functions and the optimal feedback control laws, and the approach allows the implementation of the algorithm without knowing the internal dynamics of the system. Finally, a simulation example is employed to demonstrate the effectiveness of the developed optimal control method. 展开更多
关键词 adaptive dynamic programming (ADP) CONVERGENCE STABILITY discounted quadric performanceindex
下载PDF
Adaptive fault-tolerant control for non-minimum phase hypersonic vehicles based on adaptive dynamic programming 被引量:1
8
作者 Le WANG Ruiyun QI Bin JIANG 《Chinese Journal of Aeronautics》 SCIE EI CAS CSCD 2024年第3期290-311,共22页
In this paper,a novel adaptive Fault-Tolerant Control(FTC)strategy is proposed for non-minimum phase Hypersonic Vehicles(HSVs)that are affected by actuator faults and parameter uncertainties.The strategy is based on t... In this paper,a novel adaptive Fault-Tolerant Control(FTC)strategy is proposed for non-minimum phase Hypersonic Vehicles(HSVs)that are affected by actuator faults and parameter uncertainties.The strategy is based on the output redefinition method and Adaptive Dynamic Programming(ADP).The intelligent FTC scheme consists of two main parts:a basic fault-tolerant and stable controller and an ADP-based supplementary controller.In the basic FTC part,an output redefinition approach is designed to make zero-dynamics stable with respect to the new output.Then,Ideal Internal Dynamic(IID)is obtained using an optimal bounded inversion approach,and a tracking controller is designed for the new output to realize output tracking of the nonminimum phase HSV system.For the ADP-based compensation control part,an ActionDependent Heuristic Dynamic Programming(ADHDP)adopting an actor-critic learning structure is utilized to further optimize the tracking performance of the HSV control system.Finally,simulation results are provided to verify the effectiveness and efficiency of the proposed FTC algorithm. 展开更多
关键词 Hypersonic vehicle Fault-tolerant control Non-minimum phase system Adaptive control Nonlinear control Adaptive dynamic programming
原文传递
Value Iteration-Based Cooperative Adaptive Optimal Control for Multi-Player Differential Games With Incomplete Information
9
作者 Yun Zhang Lulu Zhang Yunze Cai 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024年第3期690-697,共8页
This paper presents a novel cooperative value iteration(VI)-based adaptive dynamic programming method for multi-player differential game models with a convergence proof.The players are divided into two groups in the l... This paper presents a novel cooperative value iteration(VI)-based adaptive dynamic programming method for multi-player differential game models with a convergence proof.The players are divided into two groups in the learning process and adapt their policies sequentially.Our method removes the dependence of admissible initial policies,which is one of the main drawbacks of the PI-based frameworks.Furthermore,this algorithm enables the players to adapt their control policies without full knowledge of others’ system parameters or control laws.The efficacy of our method is illustrated by three examples. 展开更多
关键词 Adaptive dynamic programming incomplete information multi-player differential game value iteration
下载PDF
Adaptive Dynamic Programming-Based Attitude Optimal Tracking Control for a Quadrotor with Unmeasured Velocities and Model Uncertainties
10
作者 Junrui Guo Xiaoyang Gao Tieshan Li 《Guidance, Navigation and Control》 2024年第2期25-46,共22页
This paper proposes an optimal output feedback tracking control scheme of the quadrotor unmanned aerial vehicle(UAV)attitude system with unmeasured angular velocities and model uncertainties.First,neural network(NN)is... This paper proposes an optimal output feedback tracking control scheme of the quadrotor unmanned aerial vehicle(UAV)attitude system with unmeasured angular velocities and model uncertainties.First,neural network(NN)is used to approximate the model uncertainties.Then,an NN velocity observer is established to estimate the unmeasured angular velocities.Further,a quadrotor output feedback attitude optimal tracking controller is designed,which consists of an adaptive controller designed by backstepping method and an optimal compensation term designed by adaptive dynamic programming.All signals in the closed-loop system are proved to be bounded.Finally,numerical simulation example shows that the quadrotor attitude tracking scheme is effective and feasible. 展开更多
关键词 QUADROTOR optimal tracking control neural network velocity observer model uncertainties adaptive dynamic programming
原文传递
Chaotic system optimal tracking using data-based synchronous method with unknown dynamics and disturbances
11
作者 宋睿卓 魏庆来 《Chinese Physics B》 SCIE EI CAS CSCD 2017年第3期268-275,共8页
We develop an optimal tracking control method for chaotic system with unknown dynamics and disturbances. The method allows the optimal cost function and the corresponding tracking control to update synchronously. Acco... We develop an optimal tracking control method for chaotic system with unknown dynamics and disturbances. The method allows the optimal cost function and the corresponding tracking control to update synchronously. According to the tracking error and the reference dynamics, the augmented system is constructed. Then the optimal tracking control problem is defined. The policy iteration (PI) is introduced to solve the rain-max optimization problem. The off-policy adaptive dynamic programming (ADP) algorithm is then proposed to find the solution of the tracking Hamilton-Jacobi- Isaacs (HJI) equation online only using measured data and without any knowledge about the system dynamics. Critic neural network (CNN), action neural network (ANN), and disturbance neural network (DNN) are used to approximate the cost function, control, and disturbance. The weights of these networks compose the augmented weight matrix, and the uniformly ultimately bounded (UUB) of which is proven. The convergence of the tracking error system is also proven. Two examples are given to show the effectiveness of the proposed synchronous solution method for the chaotic system tracking problem. 展开更多
关键词 adaptive dynamic programming approximate dynamic programming chaotic system ZERO-SUM
下载PDF
Policy Iteration for Optimal Control of Discrete-Time Time-Varying Nonlinear Systems 被引量:1
12
作者 Guangyu Zhu Xiaolu Li +2 位作者 Ranran Sun Yiyuan Yang Peng Zhang 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2023年第3期781-791,共11页
Aimed at infinite horizon optimal control problems of discrete time-varying nonlinear systems,in this paper,a new iterative adaptive dynamic programming algorithm,which is the discrete-time time-varying policy iterati... Aimed at infinite horizon optimal control problems of discrete time-varying nonlinear systems,in this paper,a new iterative adaptive dynamic programming algorithm,which is the discrete-time time-varying policy iteration(DTTV)algorithm,is developed.The iterative control law is designed to update the iterative value function which approximates the index function of optimal performance.The admissibility of the iterative control law is analyzed.The results show that the iterative value function is non-increasingly convergent to the Bellman-equation optimal solution.To implement the algorithm,neural networks are employed and a new implementation structure is established,which avoids solving the generalized Bellman equation in each iteration.Finally,the optimal control laws for torsional pendulum and inverted pendulum systems are obtained by using the DTTV policy iteration algorithm,where the mass and pendulum bar length are permitted to be time-varying parameters.The effectiveness of the developed method is illustrated by numerical results and comparisons. 展开更多
关键词 Adaptive critic designs adaptive dynamic programming approximate dynamic programming optimal control policy iteration TIME-VARYING
下载PDF
Data-based Fault Tolerant Control for Affine Nonlinear Systems Through Particle Swarm Optimized Neural Networks 被引量:15
13
作者 Haowei Lin Bo Zhao +1 位作者 Derong Liu Cesare Alippi 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2020年第4期954-964,共11页
In this paper, a data-based fault tolerant control(FTC) scheme is investigated for unknown continuous-time(CT)affine nonlinear systems with actuator faults. First, a neural network(NN) identifier based on particle swa... In this paper, a data-based fault tolerant control(FTC) scheme is investigated for unknown continuous-time(CT)affine nonlinear systems with actuator faults. First, a neural network(NN) identifier based on particle swarm optimization(PSO) is constructed to model the unknown system dynamics. By utilizing the estimated system states, the particle swarm optimized critic neural network(PSOCNN) is employed to solve the Hamilton-Jacobi-Bellman equation(HJBE) more efficiently.Then, a data-based FTC scheme, which consists of the NN identifier and the fault compensator, is proposed to achieve actuator fault tolerance. The stability of the closed-loop system under actuator faults is guaranteed by the Lyapunov stability theorem. Finally, simulations are provided to demonstrate the effectiveness of the developed method. 展开更多
关键词 Adaptive dynamic programming(ADP) critic neural network data-based fault tolerant control(FTC) particle swarm optimization(PSO)
下载PDF
Discounted Iterative Adaptive Critic Designs With Novel Stability Analysis for Tracking Control 被引量:8
14
作者 Mingming Ha Ding Wang Derong Liu 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2022年第7期1262-1272,共11页
The core task of tracking control is to make the controlled plant track a desired trajectory.The traditional performance index used in previous studies cannot eliminate completely the tracking error as the number of t... The core task of tracking control is to make the controlled plant track a desired trajectory.The traditional performance index used in previous studies cannot eliminate completely the tracking error as the number of time steps increases.In this paper,a new cost function is introduced to develop the value-iteration-based adaptive critic framework to solve the tracking control problem.Unlike the regulator problem,the iterative value function of tracking control problem cannot be regarded as a Lyapunov function.A novel stability analysis method is developed to guarantee that the tracking error converges to zero.The discounted iterative scheme under the new cost function for the special case of linear systems is elaborated.Finally,the tracking performance of the present scheme is demonstrated by numerical results and compared with those of the traditional approaches. 展开更多
关键词 Adaptive critic design adaptive dynamic programming(ADP) approximate dynamic programming discrete-time nonlinear systems reinforcement learning stability analysis tracking control value iteration(VI)
下载PDF
Neural-Network-Based Control for Discrete-Time Nonlinear Systems with Input Saturation Under Stochastic Communication Protocol 被引量:10
15
作者 Xueli Wang Derui Ding +1 位作者 Hongli Dong Xian-Ming Zhang 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2021年第4期766-778,共13页
In this paper,an adaptive dynamic programming(ADP)strategy is investigated for discrete-time nonlinear systems with unknown nonlinear dynamics subject to input saturation.To save the communication resources between th... In this paper,an adaptive dynamic programming(ADP)strategy is investigated for discrete-time nonlinear systems with unknown nonlinear dynamics subject to input saturation.To save the communication resources between the controller and the actuators,stochastic communication protocols(SCPs)are adopted to schedule the control signal,and therefore the closed-loop system is essentially a protocol-induced switching system.A neural network(NN)-based identifier with a robust term is exploited for approximating the unknown nonlinear system,and a set of switch-based updating rules with an additional tunable parameter of NN weights are developed with the help of the gradient descent.By virtue of a novel Lyapunov function,a sufficient condition is proposed to achieve the stability of both system identification errors and the update dynamics of NN weights.Then,a value iterative ADP algorithm in an offline way is proposed to solve the optimal control of protocol-induced switching systems with saturation constraints,and the convergence is profoundly discussed in light of mathematical induction.Furthermore,an actor-critic NN scheme is developed to approximate the control law and the proposed performance index function in the framework of ADP,and the stability of the closed-loop system is analyzed in view of the Lyapunov theory.Finally,the numerical simulation results are presented to demonstrate the effectiveness of the proposed control scheme. 展开更多
关键词 Adaptive dynamic programming(ADP) constrained inputs neural network(NN) stochastic communication protocols(SCPs) suboptimal control
下载PDF
Adaptive dynamic programming for online solution of a zero-sum differential game 被引量:10
16
作者 Draguna VRABIE Frank LEWIS 《控制理论与应用(英文版)》 EI 2011年第3期353-360,共8页
This paper will present an approximate/adaptive dynamic programming(ADP) algorithm,that uses the idea of integral reinforcement learning(IRL),to determine online the Nash equilibrium solution for the two-player zerosu... This paper will present an approximate/adaptive dynamic programming(ADP) algorithm,that uses the idea of integral reinforcement learning(IRL),to determine online the Nash equilibrium solution for the two-player zerosum differential game with linear dynamics and infinite horizon quadratic cost.The algorithm is built around an iterative method that has been developed in the control engineering community for solving the continuous-time game algebraic Riccati equation(CT-GARE),which underlies the game problem.We here show how the ADP techniques will enhance the capabilities of the offline method allowing an online solution without the requirement of complete knowledge of the system dynamics.The feasibility of the ADP scheme is demonstrated in simulation for a power system control application.The adaptation goal is the best control policy that will face in an optimal manner the highest load disturbance. 展开更多
关键词 Approximate/Adaptive dynamic programming Game algebraic Riccati equation Zero-sum differential game Nash equilibrium
原文传递
Adaptive event-triggered distributed optimal guidance design via adaptive dynamic programming 被引量:4
17
作者 Teng LONG Yan CAO +1 位作者 Jingliang SUN Guangtong XU 《Chinese Journal of Aeronautics》 SCIE EI CAS CSCD 2022年第7期113-127,共15页
In this paper,the multi-missile cooperative guidance system is formulated as a general nonlinear multi-agent system.To save the limited communication resources,an adaptive eventtriggered optimal guidance law is propos... In this paper,the multi-missile cooperative guidance system is formulated as a general nonlinear multi-agent system.To save the limited communication resources,an adaptive eventtriggered optimal guidance law is proposed by designing a synchronization-error-driven triggering condition,which brings together the consensus control with Adaptive Dynamic Programming(ADP)technique.Then,the developed event-triggered distributed control law can be employed by finding an approximate solution of event-triggered coupled Hamilton-Jacobi-Bellman(HJB)equation.To address this issue,the critic network architecture is constructed,in which an adaptive weight updating law is designed for estimating the cooperative optimal cost function online.Therefore,the event-triggered closed-loop system is decomposed into two subsystems:the system with flow dynamics and the system with jump dynamics.By using Lyapunov method,the stability of this closed-loop system is guaranteed and all signals are ensured to be Uniformly Ultimately Bounded(UUB).Furthermore,the Zeno behavior is avoided.Simulation results are finally provided to demonstrate the effectiveness of the proposed method. 展开更多
关键词 Adaptive dynamic programming Distributed control Event-triggered Guidance and control Multi-agent system
原文传递
Data-Based Optimal Tracking of Autonomous Nonlinear Switching Systems 被引量:3
18
作者 Xiaofeng Li Lu Dong Changyin Sun 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2021年第1期227-238,共12页
In this paper,a data-based scheme is proposed to solve the optimal tracking problem of autonomous nonlinear switching systems.The system state is forced to track the reference signal by minimizing the performance func... In this paper,a data-based scheme is proposed to solve the optimal tracking problem of autonomous nonlinear switching systems.The system state is forced to track the reference signal by minimizing the performance function.First,the problem is transformed to solve the corresponding Bellman optimality equation in terms of the Q-function(also named as action value function).Then,an iterative algorithm based on adaptive dynamic programming(ADP)is developed to find the optimal solution which is totally based on sampled data.The linear-in-parameter(LIP)neural network is taken as the value function approximator.Considering the presence of approximation error at each iteration step,the generated approximated value function sequence is proved to be boundedness around the exact optimal solution under some verifiable assumptions.Moreover,the effect that the learning process will be terminated after a finite number of iterations is investigated in this paper.A sufficient condition for asymptotically stability of the tracking error is derived.Finally,the effectiveness of the algorithm is demonstrated with three simulation examples. 展开更多
关键词 Adaptive dynamic programming approximation error data-based control Q-LEARNING switching system
下载PDF
Data⁃Based Feedback Relearning Algorithm for Robust Control of SGCMG Gimbal Servo System with Multi⁃source Disturbance 被引量:3
19
作者 ZHANG Yong MU Chaoxu LU Ming 《Transactions of Nanjing University of Aeronautics and Astronautics》 EI CSCD 2021年第2期225-236,共12页
Single gimbal control moment gyroscope(SGCMG)with high precision and fast response is an important attitude control system for high precision docking,rapid maneuvering navigation and guidance system in the aerospace f... Single gimbal control moment gyroscope(SGCMG)with high precision and fast response is an important attitude control system for high precision docking,rapid maneuvering navigation and guidance system in the aerospace field.In this paper,considering the influence of multi-source disturbance,a data-based feedback relearning(FR)algorithm is designed for the robust control of SGCMG gimbal servo system.Based on adaptive dynamic programming and least-square principle,the FR algorithm is used to obtain the servo control strategy by collecting the online operation data of SGCMG system.This is a model-free learning strategy in which no prior knowledge of the SGCMG model is required.Then,combining the reinforcement learning mechanism,the servo control strategy is interacted with system dynamic of SGCMG.The adaptive evaluation and improvement of servo control strategy against the multi-source disturbance are realized.Meanwhile,a data redistribution method based on experience replay is designed to reduce data correlation to improve algorithm stability and data utilization efficiency.Finally,by comparing with other methods on the simulation model of SGCMG,the effectiveness of the proposed servo control strategy is verified. 展开更多
关键词 control moment gyroscope feedback relearning algorithm servo control reinforcement learning multisource disturbance adaptive dynamic programming
下载PDF
Finite horizon optimal control of discrete-time nonlinear systems with unfixed initial state using adaptive dynamic programming 被引量:2
20
作者 Wei, Qinglai Liu, Derong 《控制理论与应用(英文版)》 EI 2011年第3期381-390,共10页
In this paper,we aim to solve the finite horizon optimal control problem for a class of discrete-time nonlinear systems with unfixed initial state using adaptive dynamic programming(ADP) approach.A new-optimal control... In this paper,we aim to solve the finite horizon optimal control problem for a class of discrete-time nonlinear systems with unfixed initial state using adaptive dynamic programming(ADP) approach.A new-optimal control algorithm based on the iterative ADP approach is proposed which makes the performance index function converge iteratively to the greatest lower bound of all performance indices within an error according to within finite time.The optimal number of control steps can also be obtained by the proposed-optimal control algorithm for the situation where the initial state of the system is unfixed.Neural networks are used to approximate the performance index function and compute the optimal control policy,respectively,for facilitating the implementation of the-optimal control algorithm.Finally,a simulation example is given to show the results of the proposed method. 展开更多
关键词 Adaptive dynamic programming Unfixed initial state Optimal control Finite time Neural networks
原文传递
上一页 1 2 3 下一页 到第
使用帮助 返回顶部