10 articles found
Recent Progress in Reinforcement Learning and Adaptive Dynamic Programming for Advanced Control Applications (cited by: 4)
1
Authors: Ding Wang, Ning Gao, Derong Liu, Jinna Li, Frank L. Lewis. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 1, pp. 18-36 (19 pages)
Reinforcement learning (RL) has roots in dynamic programming and is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, where the main results for discrete-time systems and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, respectively, where event-based design, robust stabilization, and game design are reviewed. Moreover, the extensions of ADP for addressing control problems under complex environments attract enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they promote ADP formulation significantly. Finally, several typical control applications with respect to RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, the comprehensive survey on ADP and RL for advanced control applications has demonstrated its remarkable potential within the artificial intelligence era. In addition, it also plays a vital role in promoting environmental protection and industrial intelligence.
Keywords: adaptive dynamic programming (ADP); advanced control; complex environment; data-driven control; event-triggered design; intelligent control; neural networks; nonlinear systems; optimal control; reinforcement learning (RL)
Guaranteed Cost Attitude Tracking Control for Uncertain Quadrotor Unmanned Aerial Vehicle Under Safety Constraints
2
Authors: Qian Ma, Peng Jin, Frank L. Lewis. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 6, pp. 1447-1457 (11 pages)
In this paper, guaranteed cost attitude tracking control for an uncertain quadrotor unmanned aerial vehicle (QUAV) under safety constraints is studied. First, an augmented system is constructed from the tracking error system and the reference system. This transformation aims to convert the tracking control problem into a stabilization control problem. Then, a control barrier function and a disturbance attenuation function are designed to characterize the violations of safety constraints and the tolerance of uncertain disturbances, and they are incorporated into the reward function as penalty terms. Based on the modified reward function, the problem is simplified to the optimal regulation problem of the nominal augmented system, and a new Hamilton-Jacobi-Bellman equation is developed. Finally, a critic-only reinforcement learning algorithm with a concurrent learning technique is employed to solve the Hamilton-Jacobi-Bellman equation and obtain the optimal controller. The proposed algorithm not only keeps the reward function within an upper bound in the presence of uncertain disturbances, but also enforces the safety constraints. The performance of the algorithm is evaluated by numerical simulation.
Keywords: attitude tracking control; quadrotor unmanned aerial vehicle (QUAV); reinforcement learning; safety constraints; uncertain disturbances
Adaptive Uniform Performance Control of Strict-Feedback Nonlinear Systems With Time-Varying Control Gain (cited by: 2)
3
Authors: Kai Zhao, Changyun Wen, Yongduan Song, Frank L. Lewis. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2023, No. 2, pp. 451-461 (11 pages)
In this paper, we present a novel adaptive performance control approach for strict-feedback nonparametric systems with unknown time-varying control coefficients, which mainly includes the following steps. Firstly, by introducing several key transformation functions and selecting the initial value of the time-varying scaling function, the symmetric prescribed performance with global and semi-global properties can be handled uniformly, without the need for control re-design. Secondly, to handle the problem of an unknown time-varying control coefficient with an unknown sign, we propose an enhanced Nussbaum function (ENF) bearing some unique properties and characteristics, with which the complex stability analysis based on the commonly used specific Nussbaum functions is no longer required. Thirdly, by utilizing the core-function information technique, the nonparametric uncertainties in the system are gracefully handled so that no approximator is required. Furthermore, simulation results verify the effectiveness and benefits of the approach.
Keywords: adaptive control; enhanced Nussbaum function (ENF); strict-feedback systems; unified prescribed performance
Practical prescribed-time tracking control for uncertain strict-feedback systems with guaranteed performance under unknown control directions (cited by: 1)
4
Authors: Zhou Yang, Yujuan Wang, Frank L. Lewis. Journal of Automation and Intelligence, 2023, No. 2, pp. 99-104 (6 pages)
In this paper, we consider the practical prescribed-time performance guaranteed tracking control problem for a class of uncertain strict-feedback systems subject to unknown control direction. Due to the existence of unknown nonlinearities and uncertainties, it is challenging to design a controller that can ensure the stability of the closed-loop system within a predetermined finite time while maintaining the specified transient performance. The underlying problem becomes more complex still as the control directions are unknown. To deal with the above problems, a special translation function as well as a Nussbaum-type function are introduced in the prescribed performance control (PPC) framework. Finally, a PPC as well as preset finite-time tracking control scheme is designed, and its effectiveness is confirmed by both theoretical analysis and numerical simulation.
Keywords: strict-feedback systems; practical prescribed-time control; prescribed performance control; unknown control direction
Control Policy Learning Design for Vehicle Urban Positioning via BeiDou Navigation
5
Authors: QIN Yahang, ZHANG Chengye, CHEN Ci, XIE Shengli, LEWIS Frank L. Journal of Systems Science & Complexity (SCIE, EI, CSCD), 2024, No. 1, pp. 114-135 (22 pages)
This paper presents a learning-based control policy design for point-to-point vehicle positioning in the urban environment via BeiDou navigation. While navigating in urban canyons, the multipath effect is a kind of interference that causes the navigation signal to drift and thus imposes severe impacts on vehicle localization due to the reflection and diffraction of the BeiDou signal. Here, the authors formulate the navigation control system with unknown vehicle dynamics as an optimal control-seeking problem through a linear discrete-time system, and the point-to-point localization control is modeled and handled by leveraging off-policy reinforcement learning for feedback control. The proposed learning-based design guarantees optimality with prescribed performance and also stabilizes the closed-loop navigation system, without full knowledge of the vehicle dynamics. It is seen that the proposed method can withstand the impact of the multipath effect while satisfying the prescribed convergence rate. A case study demonstrates that the proposed algorithms effectively drive the vehicle to a desired setpoint under the multipath effect introduced by actual experiments of BeiDou navigation in the urban environment.
Keywords: BeiDou navigation; multipath effect; prescribed convergence rate; reinforcement learning; urban localization
Learning the continuous-time optimal decision law from discrete-time rewards
6
Authors: Ci Chen, Lihua Xie, Kan Xie, Frank Leroy Lewis, Yilu Liu, Shengli Xie. National Science Open, 2024, No. 5, pp. 130-147 (18 pages)
The concept of reward is fundamental in reinforcement learning with a wide range of applications in natural and social sciences. Seeking an interpretable reward for decision-making that largely shapes the system's behavior has always been a challenge in reinforcement learning. In this work, we explore a discrete-time reward for reinforcement learning in continuous time and action spaces that represent many phenomena captured by applying physical laws. We find that the discrete-time reward leads to the extraction of the unique continuous-time decision law and improved computational efficiency by dropping the integrator operator that appears in classical results with integral rewards. We apply this finding to solve output-feedback design problems in power systems. The results reveal that our approach removes an intermediate stage of identifying dynamical models. Our work suggests that the discrete-time reward is efficient in the search for the desired decision law, which provides a computational tool to understand and modify the behavior of large-scale engineering systems using the optimal learned decision.
Keywords: continuous-time state and action; decision law learning; discrete-time reward; dynamical systems; reinforcement learning
Discrete-time dynamic graphical games: model-free reinforcement learning solution (cited by: 6)
7
Authors: Mohammed I. ABOUHEAF, Frank L. LEWIS, Magdi S. MAHMOUD, Dariusz G. MIKULSKI. Control Theory and Technology (EI, CSCD), 2015, No. 1, pp. 55-69 (15 pages)
This paper introduces a model-free reinforcement learning technique that is used to solve a class of dynamic games known as dynamic graphical games. The graphical game results from multi-agent dynamical systems, where pinning control is used to make all the agents synchronize to the state of a command generator or a leader agent. Novel coupled Bellman equations and Hamiltonian functions are developed for the dynamic graphical games. The Hamiltonian mechanics are used to derive the necessary conditions for optimality. The solution for the dynamic graphical game is given in terms of the solution to a set of coupled Hamilton-Jacobi-Bellman equations developed herein. The Nash equilibrium solution for the graphical game is given in terms of the solution to the underlying coupled Hamilton-Jacobi-Bellman equations. An online model-free policy iteration algorithm is developed to learn the Nash solution for the dynamic graphical game. This algorithm does not require any knowledge of the agents' dynamics. A proof of convergence for this multi-agent learning algorithm is given under a mild assumption about the inter-connectivity properties of the graph. A gradient descent technique with critic network structures is used to implement the policy iteration algorithm to solve the graphical game online in real time.
Keywords: dynamic graphical games; Nash equilibrium; discrete mechanics; optimal control; model-free reinforcement learning; policy iteration
Consensus controller for multi-UAV navigation (cited by: 3)
8
Authors: Patrik KOLARIC, Ci CHEN, Ankur DALAL, Frank L. LEWIS. Control Theory and Technology (EI, CSCD), 2018, No. 2, pp. 110-121 (12 pages)
In this paper, we design consensus algorithms for multiple unmanned aerial vehicles (UAVs). We mainly focus on the control design in the face of measurement noise and propose a position consensus controller based on sliding mode control using the distributed UAV information. Within the framework of Lyapunov theory, it is shown that all signals in the closed-loop multi-UAV systems are stabilized by the proposed algorithm, while consensus errors are uniformly ultimately bounded. Moreover, for each local UAV, we propose a mechanism to define the trustworthiness, based on which the edge weights are tuned to eliminate negative influence from stubborn agents or agents exposed to extremely noisy measurements. Finally, we develop software for a nano UAV platform, based on which we implement our algorithms to address measurement noise in UAV flight tests. The experimental results validate the effectiveness of the proposed algorithms.
Keywords: consensus control; multi-agent system; quadrotor; Lyapunov stability; distributed system
Heterogeneous multi-player imitation learning
9
Authors: Bosen Lian, Wenqian Xue, Frank L. Lewis. Control Theory and Technology (EI, CSCD), 2023, No. 3, pp. 281-291 (11 pages)
This paper studies imitation learning in nonlinear multi-player game systems with heterogeneous control input dynamics. We propose a model-free data-driven inverse reinforcement learning (RL) algorithm for a learner to find the cost functions of an N-player Nash expert system given the expert's states and control inputs. This allows us to address the imitation learning problem without prior knowledge of the expert's system dynamics. To achieve this, we provide a basic model-based algorithm that is built upon RL and inverse optimal control. This serves as the foundation for our final model-free inverse RL algorithm, which is implemented via neural network-based value function approximators. Theoretical analysis and simulation examples verify the methods.
Keywords: imitation learning; inverse reinforcement learning; heterogeneous multi-player games; data-driven model-free control
Call for papers: Special issue on Learning and control in cooperative multi-agent systems
10
Authors: Frank L. Lewis, Zhong-Ping Jiang, Tengfei Liu. Control Theory and Technology (EI, CSCD), 2014, No. 2, pp. 215-216 (2 pages)
Cooperative control of multi-agent systems linked by communication networks is a well-developed and still growing field. The interplay of the individual agent dynamics and the communication graph topology results in intriguing and often surprising behaviors that are not manifested in the study of control systems for single-agent dynamics. This field brings together systems theory, feedback control, graph theory, communication systems, and complex systems theory to provide rigorous analysis and design for multiple dynamical systems interconnected by a graph information flow structure. Applications have been made to vehicle formation control, coordinated multi-satellite control, electric power system control, robotics, autonomous airborne systems, manufacturing production lines, and the synchronization of dynamical processes in chemistry, physics, biology, and chaotic systems.