7 articles found
Discounted Iterative Adaptive Critic Designs With Novel Stability Analysis for Tracking Control (cited by: 7)
1
Authors: Mingming Ha, Ding Wang, Derong Liu. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2022, No. 7, pp. 1262-1272 (11 pages)
The core task of tracking control is to make the controlled plant track a desired trajectory. The traditional performance index used in previous studies cannot completely eliminate the tracking error as the number of time steps increases. In this paper, a new cost function is introduced to develop the value-iteration-based adaptive critic framework to solve the tracking control problem. Unlike the regulator problem, the iterative value function of the tracking control problem cannot be regarded as a Lyapunov function. A novel stability analysis method is developed to guarantee that the tracking error converges to zero. The discounted iterative scheme under the new cost function for the special case of linear systems is elaborated. Finally, the tracking performance of the present scheme is demonstrated by numerical results and compared with those of the traditional approaches.
Keywords: adaptive critic design; adaptive dynamic programming (ADP); approximate dynamic programming; discrete-time nonlinear systems; reinforcement learning; stability analysis; tracking control; value iteration (VI)
Policy iteration optimal tracking control for chaotic systems by using an adaptive dynamic programming approach (cited by: 1)
2
Authors: 魏庆来, 刘德荣, 徐延才. Chinese Physics B (SCIE, EI, CAS, CSCD), 2015, No. 3, pp. 87-94 (8 pages)
A policy iteration algorithm of adaptive dynamic programming (ADP) is developed to solve the optimal tracking control for a class of discrete-time chaotic systems. By system transformations, the optimal tracking problem is transformed into an optimal regulation one. The policy iteration algorithm for discrete-time chaotic systems is first described. Then, the convergence and admissibility properties of the developed policy iteration algorithm are presented, which show that the transformed chaotic system can be stabilized under an arbitrary iterative control law and the iterative performance index function simultaneously converges to the optimum. By implementing the policy iteration algorithm via neural networks, the developed optimal tracking control scheme for chaotic systems is verified by a simulation.
Keywords: adaptive critic designs; adaptive dynamic programming; approximate dynamic programming; neuro-dynamic programming
Policy Iteration for Optimal Control of Discrete-Time Time-Varying Nonlinear Systems (cited by: 1)
3
Authors: Guangyu Zhu, Xiaolu Li, Ranran Sun, Yiyuan Yang, Peng Zhang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2023, No. 3, pp. 781-791 (11 pages)
For infinite-horizon optimal control problems of discrete-time time-varying nonlinear systems, a new iterative adaptive dynamic programming algorithm, the discrete-time time-varying policy iteration (DTTV) algorithm, is developed. The iterative control law is designed to update the iterative value function, which approximates the index function of optimal performance. The admissibility of the iterative control law is analyzed. The results show that the iterative value function is non-increasingly convergent to the Bellman-equation optimal solution. To implement the algorithm, neural networks are employed and a new implementation structure is established, which avoids solving the generalized Bellman equation in each iteration. Finally, the optimal control laws for torsional pendulum and inverted pendulum systems are obtained by using the DTTV policy iteration algorithm, where the mass and pendulum bar length are permitted to be time-varying parameters. The effectiveness of the developed method is illustrated by numerical results and comparisons.
Keywords: adaptive critic designs; adaptive dynamic programming; approximate dynamic programming; optimal control; policy iteration; time-varying
Optimal Neuro-Control Strategy for Nonlinear Systems With Asymmetric Input Constraints (cited by: 6)
4
Authors: Xiong Yang, Bo Zhao. IEEE/CAA Journal of Automatica Sinica (EI, CSCD), 2020, No. 2, pp. 575-583 (9 pages)
In this paper, we present an optimal neuro-control scheme for continuous-time (CT) nonlinear systems with asymmetric input constraints. Initially, we introduce a discounted cost function for the CT nonlinear systems in order to handle the asymmetric input constraints. Then, we develop a Hamilton-Jacobi-Bellman equation (HJBE), which arises in the discounted cost optimal control problem. To obtain the optimal neurocontroller, we utilize a critic neural network (CNN) to solve the HJBE under the framework of reinforcement learning. The CNN's weight vector is tuned via the gradient descent approach. Based on the Lyapunov method, we prove that uniform ultimate boundedness of the CNN's weight vector and the closed-loop system is guaranteed. Finally, we verify the effectiveness of the present optimal neuro-control strategy through performing simulations of two examples.
Keywords: adaptive critic designs (ACDs); asymmetric input constraint; critic neural network (CNN); nonlinear systems; optimal control; reinforcement learning (RL)
A new approach of optimal control for a class of continuous-time chaotic systems by an online ADP algorithm
5
Authors: 宋睿卓, 肖文栋, 魏庆来. Chinese Physics B (SCIE, EI, CAS, CSCD), 2014, No. 5, pp. 138-144 (7 pages)
We develop an online adaptive dynamic programming (ADP) based optimal control scheme for continuous-time chaotic systems. The idea is to use the ADP algorithm to obtain the optimal control input that makes the performance index function reach an optimum. The expression of the performance index function for the chaotic system is first presented. The online ADP algorithm is presented to achieve optimal control. In the ADP structure, neural networks are used to construct a critic network and an action network, which can obtain an approximate performance index function and the control input, respectively. It is proven that the critic parameter error dynamics and the closed-loop chaotic systems are uniformly ultimately bounded exponentially. Our simulation results illustrate the performance of the established optimal control method.
Keywords: adaptive dynamic programming; adaptive critic designs; optimal control; continuous-time chaotic system
State of the Art of Adaptive Dynamic Programming and Reinforcement Learning
6
Authors: Derong Liu, Mingming Ha, Shan Xue. CAAI Artificial Intelligence Research, 2022, No. 2, pp. 93-110 (18 pages)
This article introduces the state-of-the-art development of adaptive dynamic programming and reinforcement learning (ADPRL). First, algorithms in reinforcement learning (RL) are introduced and their roots in dynamic programming are illustrated. Adaptive dynamic programming (ADP) is then introduced following a brief discussion of dynamic programming. Researchers in ADP and RL have enjoyed the fast developments of the past decade from algorithms, to convergence and optimality analyses, and to stability results. Several key steps in the recent theoretical developments of ADPRL are mentioned with some future perspectives. In particular, convergence and optimality results of value iteration and policy iteration are reviewed, followed by an introduction to the most recent results on stability analysis of value iteration algorithms.
Keywords: adaptive dynamic programming; approximate dynamic programming; adaptive critic designs; neuro-dynamic programming; neural dynamic programming; reinforcement learning; intelligent control; learning control; optimal control
A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems (cited by: 8)
7
Authors: WEI QingLai, LIU DeRong. Science China Chemistry (SCIE, EI, CAS, CSCD), 2015, No. 12, pp. 143-157 (15 pages)
In this paper, a novel iterative Q-learning algorithm, called "policy iteration based deterministic Q-learning algorithm", is developed to solve the optimal control problems for discrete-time deterministic nonlinear systems. The idea is to use an iterative adaptive dynamic programming (ADP) technique to construct the iterative control law which optimizes the iterative Q function. When the optimal Q function is obtained, the optimal control law can be achieved by directly minimizing the optimal Q function, where the mathematical model of the system is not necessary. Convergence property is analyzed to show that the iterative Q function is monotonically non-increasing and converges to the solution of the optimality equation. It is also proven that any of the iterative control laws is a stable control law. Neural networks are employed to implement the policy iteration based deterministic Q-learning algorithm, by approximating the iterative Q function and the iterative control law, respectively. Finally, two simulation examples are presented to illustrate the performance of the developed algorithm.
Keywords: adaptive critic designs; adaptive dynamic programming; approximate dynamic programming; Q-learning; policy iteration; neural networks; nonlinear systems; optimal control