292 articles found
Recent Progress in Reinforcement Learning and Adaptive Dynamic Programming for Advanced Control Applications (Cited by: 4)
1
Authors: Ding Wang, Ning Gao, Derong Liu, Jinna Li, Frank L. Lewis. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2024, No. 1, pp. 18-36, 19 pages.
Reinforcement learning (RL) has roots in dynamic programming, and within the control community it is called adaptive/approximate dynamic programming (ADP). This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, and the main results for discrete-time systems and continuous-time systems are surveyed in turn. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, covering event-based design, robust stabilization, and game design. Moreover, the extensions of ADP for addressing control problems under complex environments have attracted enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they significantly promote ADP formulation. Finally, several typical control applications of RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, the comprehensive survey of ADP and RL for advanced control applications has demonstrated their remarkable potential in the artificial intelligence era. They also play a vital role in promoting environmental protection and industrial intelligence.
Keywords: adaptive dynamic programming (ADP); advanced control; complex environment; data-driven control; event-triggered design; intelligent control; neural networks; nonlinear systems; optimal control; reinforcement learning (RL)
Adaptive Optimal Discrete-Time Output-Feedback Using an Internal Model Principle and Adaptive Dynamic Programming (Cited by: 1)
2
Authors: Zhongyang Wang, Youqing Wang, Zdzisław Kowalczuk. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2024, No. 1, pp. 131-140, 10 pages.
To address the output feedback problem for linear discrete-time systems, this work proposes a new adaptive dynamic programming (ADP) technique based on the internal model principle (IMP). The proposed method, termed IMP-ADP, does not require complete state feedback, merely the measurement of input and output data. More specifically, based on the IMP, the output control problem can first be converted into a stabilization problem. An observer is then designed to reproduce the full state of the system from the measured inputs and outputs. Moreover, the technique includes both a policy iteration algorithm and a value iteration algorithm to determine the optimal feedback gain without using a dynamic system model. Importantly, with this concept one does not need to solve the regulator equation. Finally, the control method was tested on an LCL grid-connected inverter system to demonstrate that it provides the desired performance in terms of both tracking and disturbance rejection.
Keywords: adaptive dynamic programming (ADP); internal model principle (IMP); output feedback problem; policy iteration (PI); value iteration (VI)
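To illustrate the two iteration schemes named in this abstract, the following minimal sketch contrasts policy iteration (PI) and value iteration (VI) on a scalar linear-quadratic problem. It is not the IMP-ADP algorithm: it is model-based, whereas the paper works from input and output data. The plant x_{k+1} = a x_k + b u_k, the weights q and r, and all function names are assumptions chosen for the example.

```python
def value_iteration(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Iterate the scalar Riccati recursion from P = 0 (no stabilizing gain needed)."""
    p = 0.0
    for _ in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


def policy_iteration(a, b, q, r, k0, tol=1e-12, max_iter=100):
    """Alternate policy evaluation and improvement, starting from a stabilizing gain k0."""
    k = k0
    p = None
    for _ in range(max_iter):
        closed_loop = a - b * k
        if abs(closed_loop) >= 1.0:
            raise ValueError("gain is not stabilizing")
        # Policy evaluation: solve P = q + r k^2 + (a - b k)^2 P for P.
        p_new = (q + r * k * k) / (1.0 - closed_loop ** 2)
        # Policy improvement: greedy gain with respect to the evaluated P.
        k = a * b * p_new / (r + b * b * p_new)
        if p is not None and abs(p_new - p) < tol:
            return p_new, k
        p = p_new
    return p, k


# Hypothetical unstable plant (a > 1) with unit weights.
p_vi = value_iteration(1.2, 1.0, 1.0, 1.0)
p_pi, k_pi = policy_iteration(1.2, 1.0, 1.0, 1.0, k0=1.0)
```

Both routines should settle on the same Riccati solution. The practical difference in this sketch is the starting point: PI needs an initial stabilizing gain, while VI starts from a zero value function.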
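Temperature Tracking Control of Microwave-Heated High-Titanium Slag Based on Finite-Time ADP (Cited by: 1)
3
Authors: 杨彪, 杜婉, 李鑫培, 高皓, 刘承, 马红涛. 《控制工程》 (CSCD, PKU Core), 2024, No. 2, pp. 193-202, 10 pages.
To address the unsatisfactory performance of conventional control methods in microwave heating processes, a finite-time adaptive dynamic programming algorithm for microwave heating temperature tracking is proposed, based on a data-driven model. The algorithm comprises a model network, a critic network, and an action network, all three implemented with neural networks. The model network provides data-driven modeling of the microwave heating process, while the critic and action networks approximate the optimal performance index function and the control power, respectively. Temperature tracking is then converted into stabilization of the error. The convergence and optimality of the algorithm are proven theoretically, and temperature tracking experiments and simulations on microwave heating of high-titanium slag are carried out. The results show that the algorithm tracks the heating process of high-titanium slag effectively: the prediction error of the Elman-neural-network-based model is below 1 °C and the temperature tracking error is below 0.2 °C, indicating potential value for industrial microwave heating.
Keywords: microwave heating; high-titanium slag; finite time; adaptive dynamic programming; neural networks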
Parallel Control for Optimal Tracking via Adaptive Dynamic Programming (Cited by: 23)
4
Authors: Jingwei Lu, Qinglai Wei, Fei-Yue Wang. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2020, No. 6, pp. 1662-1674, 13 pages.
This paper studies the problem of optimal parallel tracking control for continuous-time general nonlinear systems. Unlike existing optimal state feedback control, the control input of the optimal parallel control is introduced into the feedback system. However, because the control input is introduced into the feedback system, optimal state feedback control methods cannot be applied directly. To address this problem, an augmented system and an augmented performance index function are first proposed, so that the general nonlinear system is transformed into an affine nonlinear system. The difference between the optimal parallel control and the optimal state feedback control is analyzed theoretically. It is proven that the optimal parallel control with the augmented performance index function can be seen as the suboptimal state feedback control with the traditional performance index function. Moreover, an adaptive dynamic programming (ADP) technique is used to implement the optimal parallel tracking control, with a critic neural network (NN) approximating the value function online. The stability of the closed-loop system is analyzed using Lyapunov theory, and the tracking error and NN weight errors are shown to be uniformly ultimately bounded (UUB). In addition, the optimal parallel controller guarantees the continuity of the control input when the reference signals contain finite jump discontinuities. Finally, the effectiveness of the developed optimal parallel control method is verified in two cases.
Keywords: adaptive dynamic programming (ADP); nonlinear optimal control; parallel controller; parallel control theory; parallel system; tracking control; neural network (NN)
Residential Energy Scheduling for Variable Weather Solar Energy Based on Adaptive Dynamic Programming (Cited by: 15)
5
Authors: Derong Liu, Yancai Xu, Qinglai Wei, Xinliang Liu. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2018, No. 1, pp. 36-46, 11 pages.
The residential energy scheduling of solar energy is an important research area of the smart grid. On the demand side, factors such as household loads, storage batteries, the outside public utility grid, and renewable energy resources combine into a nonlinear, time-varying, indefinite, and complex system that is difficult to manage or optimize. Many nations have already applied residential real-time pricing to balance the burden on their grids. To enhance the electricity efficiency of the residential microgrid, this paper presents an action-dependent heuristic dynamic programming (ADHDP) method to solve the residential energy scheduling problem. The highlights of this paper are as follows. First, weather-type classification is adopted to establish three types of programming models based on the features of the solar energy. In addition, the priorities of different energy resources are set to reduce the loss of electrical energy in transmission. Second, three ADHDP-based neural networks, which can update themselves during operation, are designed to manage the flows of electricity. Third, simulation results show that the proposed scheduling method effectively reduces the total electricity cost and improves load balancing. A comparison with the particle swarm optimization algorithm further indicates that the method is promising for cost-saving energy management.
Keywords: action-dependent heuristic dynamic programming; adaptive dynamic programming; control strategy; residential energy management; smart grid
Optimal Control for a Class of Complex Singular System Based on Adaptive Dynamic Programming (Cited by: 6)
6
Authors: Zhan Shi, Zhanshan Wang. 《IEEE/CAA Journal of Automatica Sinica》 (EI, CSCD), 2019, No. 1, pp. 188-197, 10 pages.
This paper presents a new design approach to achieve decentralized optimal control of high-dimensional complex singular systems with dynamic uncertainties. Based on the robust adaptive dynamic programming (robust ADP) method, controllers are designed to solve the optimal control problem for singular systems. The proposed algorithm works when the system model is not exactly known but the input and output data can be measured. The policy iteration of each controller uses only its own state and input information for learning and does not need knowledge of the whole system dynamics. Simulation results on the New England 10-machine 39-bus test system show the effectiveness of the designed controller.
Keywords: adaptive dynamic programming (ADP); decentralized control; frequency control; power system; singular systems
Optimal Constrained Self-learning Battery Sequential Management in Microgrid Via Adaptive Dynamic Programming (Cited by: 16)
7
Authors: Qinglai Wei, Derong Liu, Yu Liu, Ruizhuo Song. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2017, No. 2, pp. 168-176, 9 pages.
This paper concerns a novel optimal self-learning battery sequential control scheme for smart home energy systems. The main idea is to use the adaptive dynamic programming (ADP) technique to obtain the optimal battery sequential control iteratively. First, the battery energy management system model is established, taking the power efficiency of the battery into account. Next, considering the power constraints of the battery, a new non-quadratic performance index function is established, which guarantees that the iterative control law cannot exceed the maximum charging/discharging power of the battery, thereby extending the service life of the battery. Then, the convergence properties of the iterative ADP algorithm are analyzed, which guarantees that the iterative value function and the iterative control law both reach their optimums. Finally, simulation and comparison results are given to illustrate the performance of the presented method. © 2017 Chinese Association of Automation.
Keywords: adaptive control systems; automation; battery management systems; control theory; electric batteries; energy management; energy management systems; intelligent buildings; iterative methods; number theory; secondary batteries
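The abstract does not state the form of its non-quadratic performance index. As a hedged illustration of how such an index can keep a control law inside a power limit, the sketch below uses the inverse-hyperbolic-tangent penalty that is common in the constrained-ADP literature. The penalty form, the bound `u_max`, the weight `r`, and the function names are all assumptions, not taken from the paper.

```python
import math


def nonquadratic_cost(u, u_max, r=1.0):
    """Closed form of W(u) = 2 r * integral from 0 to u of u_max * atanh(s / u_max) ds."""
    if abs(u) >= u_max:
        raise ValueError("control must lie strictly inside the bound")
    t = u / u_max
    return 2.0 * r * u_max * u * math.atanh(t) + r * u_max ** 2 * math.log(1.0 - t * t)


def bounded_control(costate_term, u_max, r=1.0):
    """Minimizer of W(u) + costate_term * u, which never exceeds u_max in magnitude."""
    # Setting dW/du + costate_term = 0 gives 2 r u_max atanh(u / u_max) = -costate_term.
    return -u_max * math.tanh(costate_term / (2.0 * r * u_max))


# Even a very large (hypothetical) costate term yields a control within the bound.
u_saturated = bounded_control(1e6, u_max=2.0)
```

For small controls this penalty behaves like the quadratic term r u^2. Its slope grows without bound as u approaches u_max, which is what forces the minimizing control to saturate smoothly through the tanh function.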
An Optimal Control Scheme for a Class of Discrete-time Nonlinear Systems with Time Delays Using Adaptive Dynamic Programming (Cited by: 17)
8
Authors: WEI Qing-Lai, ZHANG Hua-Guang, LIU De-Rong, ZHAO Yan. 《自动化学报》 (EI, CSCD, PKU Core), 2010, No. 1, pp. 121-129, 9 pages.
Keywords: nonlinear systems; optimal control; control variables; dynamic programming
Policy iteration optimal tracking control for chaotic systems by using an adaptive dynamic programming approach (Cited by: 2)
9
Authors: 魏庆来, 刘德荣, 徐延才. 《Chinese Physics B》 (SCIE, EI, CAS, CSCD), 2015, No. 3, pp. 87-94, 8 pages.
A policy iteration algorithm of adaptive dynamic programming (ADP) is developed to solve the optimal tracking control problem for a class of discrete-time chaotic systems. By system transformations, the optimal tracking problem is transformed into an optimal regulation problem. The policy iteration algorithm for discrete-time chaotic systems is first described. Then, the convergence and admissibility properties of the developed policy iteration algorithm are presented, which show that the transformed chaotic system can be stabilized under an arbitrary iterative control law while the iterative performance index function converges to the optimum. The policy iteration algorithm is implemented via neural networks, and the developed optimal tracking control scheme for chaotic systems is verified by a simulation.
Keywords: adaptive critic designs; adaptive dynamic programming; approximate dynamic programming; neuro-dynamic programming
Hierarchical adaptive stereo matching algorithm for obstacle detection with dynamic programming (Cited by: 1)
10
Authors: Ming BAI, Yan ZHUANG, Wei WANG. 《控制理论与应用(英文版)》 (EI), 2009, No. 1, pp. 41-47, 7 pages.
An adaptive weighted stereo matching algorithm with multilevel and bidirectional dynamic programming based on ground control points (GCPs) is presented. To decrease time complexity without losing matching precision, a multilevel search scheme is used: coarse matching is processed in the typical disparity space image, while fine matching is processed in the disparity-offset space image. In the upper level, GCPs are obtained by an enhanced volumetric iterative algorithm enforcing the mutual constraint and the threshold constraint. Under the supervision of the highly reliable GCPs, a bidirectional dynamic programming framework is employed to resolve inconsistency in the optimization path. In the lower level, to reduce running time, a disparity-offset space is proposed to obtain the dense disparity image efficiently. In addition, an adaptive dual support-weight strategy that considers photometric and geometric information is presented to aggregate the matching cost. Further, a post-processing algorithm based on a dual threshold improves disparity results in areas with depth discontinuities and occlusions, where missing stereo information is substituted from surrounding regions. To demonstrate the effectiveness of the algorithm, two groups of experimental results are presented for four widely used standard stereo data sets, including a discussion of performance and a comparison with other methods, which show that the algorithm is fast and also significantly improves the efficiency of holistic optimization.
Keywords: stereo matching; ground control points; adaptive weighted aggregation; bidirectional dynamic programming; obstacle detection based on stereo vision
Event-based performance guaranteed tracking control for constrained nonlinear system via adaptive dynamic programming method
11
Authors: Xingyi Zhang, Zijie Guo, Hongru Ren, Hongyi Li. 《Journal of Automation and Intelligence》, 2023, No. 4, pp. 239-247, 9 pages.
This paper discusses an optimal tracking control problem for a class of nonlinear systems with guaranteed performance and asymmetric input constraints. The control policy is implemented by an adaptive dynamic programming (ADP) algorithm under two event-based triggering mechanisms. Designing an optimal control law is often challenging because of the system deviation caused by asymmetric input constraints. First, a prescribed performance control technique is employed to keep the tracking errors within predetermined boundaries. Subsequently, considering the asymmetric input constraints, a discounted non-quadratic cost function is introduced. Moreover, to reduce controller updates, an event-triggered control law is developed for the ADP algorithm. After that, to further simplify controller design, the work is extended to a self-triggered case that relaxes the need for continuous signal monitoring by hardware devices. Using the Lyapunov method, the uniform ultimate boundedness of all signals is proven. Finally, a simulation example on a mass-spring-damper system subject to asymmetric input constraints is provided to validate the effectiveness of the proposed control scheme.
Keywords: adaptive dynamic programming (ADP); asymmetric input constraints; prescribed performance control; event-triggered control; optimal tracking control
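The idea of reducing controller updates through an event-triggered law can be sketched on a toy problem. The example below is not the paper's triggering mechanism: the scalar plant, the fixed feedback gain `k`, the relative threshold `sigma`, and the function name are all hypothetical, chosen only to show the controller holding its last state sample until the gap to the current state grows too large.

```python
def simulate_event_triggered(a=1.1, b=1.0, k=0.15, sigma=0.3, x0=1.0, steps=200):
    """Simulate x_{k+1} = a x_k + b u_k with u_k computed from a held state sample."""
    x = x0
    x_held = x0   # state sample currently held by the controller
    updates = 1   # the initial sample counts as the first update
    for _ in range(steps):
        # Triggering rule (assumed): resample when the gap exceeds sigma * |x|.
        if abs(x - x_held) > sigma * abs(x):
            x_held = x
            updates += 1
        u = -k * x_held   # control stays constant between events
        x = a * x + b * u
    return x, updates


x_final, n_updates = simulate_event_triggered()
```

With these assumed parameters the state still decays toward zero while the controller resamples on only a fraction of the steps. Setting `sigma` to 0 makes the controller resample at nearly every step, which recovers ordinary sampled feedback.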
Extended Sequential Truncation Technique for Adaptive Dynamic Programming Based Security-Constrained Unit Commitment with Optimal Power Flow Constraints
12
Authors: Danli Long, Hua Wei. 《Journal of Power and Energy Engineering》, 2014, No. 4, pp. 687-693, 7 pages.
Considering both the economics and the security of power system operation, this paper presents a new adaptive dynamic programming approach for security-constrained unit commitment (SCUC) problems. In response to the "curse of dimensionality" of dynamic programming, the approach solves the Bellman equation of SCUC approximately by solving a sequence of simplified single-stage optimization problems. An extended sequential truncation technique is proposed to explore the state space, and it outperforms traditional sequential truncation in daily cost for unit commitment. Test cases from 30 to 300 buses over a 24 h horizon are analyzed. Extensive numerical comparisons show that the proposed approach obtains optimal unit commitment schedules without any network or bus voltage violations while also minimizing the operation cost.
Keywords: power system operation and planning; priority order; adaptive dynamic programming; unit commitment
Value Iteration-Based Cooperative Adaptive Optimal Control for Multi-Player Differential Games With Incomplete Information
13
Authors: Yun Zhang, Lulu Zhang, Yunze Cai. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2024, No. 3, pp. 690-697, 8 pages.
This paper presents a novel cooperative value iteration (VI)-based adaptive dynamic programming method for multi-player differential game models, with a convergence proof. The players are divided into two groups in the learning process and adapt their policies sequentially. The method removes the dependence on admissible initial policies, which is one of the main drawbacks of PI-based frameworks. Furthermore, the algorithm enables the players to adapt their control policies without full knowledge of the others' system parameters or control laws. The efficacy of the method is illustrated by three examples.
Keywords: adaptive dynamic programming; incomplete information; multi-player differential game; value iteration
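ELM-Based ADP Control of Raw Meal Fineness in a Cement Vertical Mill (Cited by: 6)
14
Authors: 林小峰, 孔伟凯. 《系统仿真学报》 (CAS, CSCD, PKU Core), 2016, No. 11, pp. 2764-2770, 7 pages.
The vertical-mill grinding process in cement production is nonlinear, strongly coupled, and subject to large time delays, which makes accurate modeling and control of raw meal fineness difficult. An adaptive dynamic programming (ADP) optimal control algorithm based on the extreme learning machine (ELM) is proposed. An ELM is used to build a prediction model of raw meal fineness for the vertical-mill grinding process, which serves as the model network in the ADP algorithm, and online sequential ELMs implement the action network and critic network of ADP. The results show that, in simulation, the proposed algorithm controls raw meal fineness effectively and offers theoretical guidance for stable vertical-mill operation and for reducing the energy consumption of the process.
Keywords: cement vertical mill; raw meal; adaptive dynamic programming; extreme learning machine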
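ADP-Based Nonlinear Optimal Control of Hypersonic Vehicles (Cited by: 3)
15
Authors: 郭超, 梁晓庚, 王斐. 《火力与指挥控制》 (CSCD, PKU Core), 2014, No. 6, pp. 77-81, 5 pages.
A nonlinear optimal controller design method based on adaptive dynamic programming (ADP) is proposed for hypersonic vehicles. First, the optimal controller design problem for a nonlinear system is equivalent to solving the Hamilton-Jacobi-Bellman (HJB) equation. Traditional methods for solving the HJB equation require exact knowledge of the system dynamics, but in practice the dynamics are often unknown or only partially known. For hypersonic vehicles with unknown system dynamics, an ADP algorithm is used to solve the HJB equation online and design the nonlinear optimal controller. A notable advantage of this design method is that it requires no knowledge of the internal system dynamics. Finally, simulation results verify the effectiveness of the designed controller.
Keywords: hypersonic vehicle; adaptive dynamic programming; nonlinear optimal control; HJB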
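ADP-Based Guidance Strategy for Intercepting Maneuvering Targets Under Radome Error (Cited by: 2)
16
Authors: 郭建国, 胡冠杰, 郭宗易, 王国庆. 《宇航学报》 (EI, CAS, CSCD, PKU Core), 2022, No. 7, pp. 911-920, 10 pages.
A guidance strategy based on adaptive dynamic programming (ADP) is proposed to handle the radome error of a missile seeker. Unlike the traditional approach of estimating and compensating for radome error, it avoids the influence of errors introduced during estimation. In the scenario of a missile intercepting a maneuvering target, the interception problem is converted into a robust optimal control problem. A cost function is designed that both eliminates the effects of radome error and target maneuvering and minimizes control energy. By constructing a critic network, ADP is used to solve for an approximate robust optimal guidance strategy, and a robust control term is added to obtain the final guidance strategy for intercepting maneuvering targets. Lyapunov stability theory is used to prove that the weight errors are uniformly ultimately bounded and that the closed-loop system is asymptotically stable. Simulation results verify the effectiveness of the proposed guidance strategy for intercepting maneuvering targets under radome error.
Keywords: radome error; adaptive dynamic programming; guidance strategy; robust optimal control; Lyapunov stability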
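ADP-Based Tracking Control for a Class of Discrete-Time Systems with Time Delays (Cited by: 1)
17
Authors: 林小峰, 杨晓娜, 黄清宝, 宋春宁. 《广西大学学报(自然科学版)》 (CAS, CSCD, PKU Core), 2011, No. 6, pp. 994-999, 6 pages.
Time delay is a physical phenomenon that is widespread in nature. Its presence prevents the controlled variable from reflecting changes in the system promptly, which degrades the stability of the control system and makes controller design for time-delay systems very difficult. Tracking control is studied for a class of discrete-time affine systems with delays in both the state and the control input, and an iterative adaptive dynamic programming algorithm is used to solve the tracking control problem. On the basis of adaptive dynamic programming, a system performance index function is established, the tracking problem is converted into an optimal regulation problem through a system transformation, and the performance index function is solved iteratively to obtain the optimal control policy. A simulation example is given, and the results show that the proposed tracking control scheme is effective.
Keywords: time delay; tracking; iteration; discrete-time nonlinear systems; adaptive dynamic programming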
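Tension Control of a Carbon Fiber Angle-Interlock Loom Based on Policy-Iteration ADP
18
Authors: 刘薇, 张黎, 李想. 《天津工业大学学报》 (CAS, PKU Core), 2023, No. 1, pp. 72-80, 9 pages.
For the warp tension control problem of a carbon fiber angle-interlock loom, a discrete-time nonlinear tension control model of the let-off system is established that accounts for the influence of shedding and other uncertain factors on warp tension. Policy-iteration adaptive dynamic programming (ADP) is proposed, and an adaptive weight update law is designed for the critic network in ADP. The convergence of policy-iteration ADP for discrete-time systems is proven. The method reduces the influence of nonlinearity and uncertainty on warp tension, achieves stable control of warp tension, and improves system robustness. Simulation results show that, compared with traditional ADP, policy-iteration ADP brings the warp tension to a steady state quickly and without fluctuation within 2 s, and the system performance index function converges to a better value.
Keywords: carbon fiber angle-interlock loom; let-off system; policy-iteration ADP; adaptive weight update law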
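Application of the DP-ADPSO Algorithm to Optimal Unit Load Commitment and Dispatch
19
Authors: 闫旺, 李郁侠, 师彪, 孟欣, 李鹏, 牛艳利. 《沈阳农业大学学报》 (CAS, CSCD, PKU Core), 2010, No. 1, pp. 64-68, 5 pages.
To address the premature convergence of discrete particle swarm optimization when applied to unit load optimization, a dynamic programming-adaptive discrete particle swarm optimization (DP-ADPSO) algorithm is proposed for the optimal unit load commitment problem. The method first ensures that all randomly generated particles are feasible solutions satisfying the basic constraints, so that the whole algorithm searches only within the feasible region, which shortens computation time. Computational examples show that the DP-ADPSO algorithm converges well to the optimal solution, with high accuracy and fast convergence. It outperforms both dynamic programming and discrete particle swarm optimization, indicating that the method is effective, reasonable, and promising for application.
Keywords: discrete particle swarm optimization; dynamic programming-adaptive discrete particle swarm optimization; optimal unit commitment; load dispatch; global optimum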
Discounted Iterative Adaptive Critic Designs With Novel Stability Analysis for Tracking Control (Cited by: 9)
20
Authors: Mingming Ha, Ding Wang, Derong Liu. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2022, No. 7, pp. 1262-1272, 11 pages.
The core task of tracking control is to make the controlled plant track a desired trajectory. The traditional performance index used in previous studies cannot completely eliminate the tracking error as the number of time steps increases. In this paper, a new cost function is introduced to develop a value-iteration-based adaptive critic framework for solving the tracking control problem. Unlike in the regulator problem, the iterative value function of the tracking control problem cannot be regarded as a Lyapunov function. A novel stability analysis method is developed to guarantee that the tracking error converges to zero. The discounted iterative scheme under the new cost function is elaborated for the special case of linear systems. Finally, the tracking performance of the scheme is demonstrated by numerical results and compared with that of traditional approaches.
Keywords: adaptive critic design; adaptive dynamic programming (ADP); approximate dynamic programming; discrete-time nonlinear systems; reinforcement learning; stability analysis; tracking control; value iteration (VI)