Event-triggered optimal tracking control for nonzero-sum differential game systems


Abstract: In recent years, the tracking problem for nonzero-sum differential game systems with unknown dynamics has been studied, but the existing methods are time-triggered and are therefore ill-suited to environments with limited transmission bandwidth and computing resources. For continuous-time nonlinear nonzero-sum differential game systems with unknown dynamics, this paper proposes an event-triggered adaptive dynamic programming method based on integral reinforcement learning. Inspired by the gradient descent method and the experience replay technique, the scheme uses both historical and current data to update the neural network weights. This speeds up the convergence of the neural network weights and removes the initial admissible control assumption commonly made in the literature. The algorithm also introduces a persistent excitation (PE) condition that is easy to check online, avoiding the traditional PE condition, which is difficult to verify. Based on Lyapunov theory, the tracking error and the critic neural network estimation error are proved to be uniformly ultimately bounded (UUB). Finally, a numerical simulation example verifies the feasibility of the proposed method.
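The abstract does not give the weight update law, so the following is only a generic sketch of the idea it names: a gradient-descent step that reuses stored (historical) samples alongside the current one, plus a rank test on the stored regressors as one example of an excitation condition that can be checked online. It assumes a linear-in-weights approximator with noise-free targets. The names `rank`, `replay_step`, and `demo`, the normalization term, and the rank condition itself are illustrative assumptions, not the authors' formulation.

```python
import random

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def rank(rows, tol=1e-9):
    """Rank of a small matrix (list of rows) via Gaussian elimination."""
    m = [list(r) for r in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = max(range(rk, len(m)), key=lambda i: abs(m[i][col]), default=None)
        if pivot is None or abs(m[pivot][col]) < tol:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            f = m[i][col] / m[rk][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def replay_step(w, current, history, lr=0.5):
    """One normalized gradient step on the squared residuals of the
    current sample and every stored sample (assumed update form)."""
    grad = [0.0] * len(w)
    for phi, y in [current] + history:
        scale = (dot(phi, w) - y) / (1.0 + dot(phi, phi)) ** 2
        grad = [g + scale * p for g, p in zip(grad, phi)]
    return [wi - lr * g for wi, g in zip(w, grad)]

def demo(seed=0, steps=2000):
    """Fit hypothetical weights w_true from replayed and fresh samples."""
    rng = random.Random(seed)
    w_true = [1.0, -2.0, 0.5]  # hypothetical ideal weights
    sample = lambda: [rng.gauss(0.0, 1.0) for _ in w_true]
    history = [(phi, dot(phi, w_true)) for phi in (sample() for _ in range(5))]
    # Online-checkable richness test: stored regressors span the weight space.
    rich = rank([phi for phi, _ in history]) == len(w_true)
    w = [0.0] * len(w_true)
    for _ in range(steps):
        phi = sample()
        w = replay_step(w, (phi, dot(phi, w_true)), history)
    return rich, w, w_true
```

Calling `demo()` returns the result of the rank test together with the learned and reference weights. The point of the rank test is that it only inspects data already recorded, so it can be evaluated while the controller runs, unlike a classical PE condition stated over future signal behavior.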

Authors: SHI Yi-bo; WANG Chao-li (College of Science, University of Shanghai for Science and Technology, Shanghai 200093, China; School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China)

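Affiliations: College of Science, University of Shanghai for Science and Technology; School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology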

Source: Control Theory & Applications (《控制理论与应用》), 2023, No. 2, pp. 220-230 (11 pages); indexed in EI, CAS, CSCD, and the Peking University Core Journals list

Funding: Supported by the National Defense Basic Research Program (JCKY2019413D001), the Natural Science Foundation (6217023627, 62003214, 62173054), and the Shanghai Natural Science Foundation (19ZR1436000).

Keywords: nonzero-sum games; integral reinforcement learning; optimal tracking control; neural network; event-triggered

Classification: TP13 [Automation and Computer Technology: Control Theory and Control Engineering]

