
An Online Q-Learning Method for Linear-Quadratic Nonzero-Sum Stochastic Differential Games with Completely Unknown Dynamics

Abstract: In this paper, the authors design a reinforcement learning algorithm to solve the adaptive linear-quadratic stochastic n-player nonzero-sum differential game with completely unknown dynamics. For each player, a critic network is used to estimate the Q-function, and an actor network is used to estimate the control input. A model-free online Q-learning algorithm is obtained for solving this kind of problem. It is proved that, under some mild conditions, the system state and weight estimation errors are uniformly ultimately bounded. A simulation with five players is given to verify the effectiveness of the algorithm.
Source: Journal of Systems Science & Complexity (系统科学与复杂性学报(英文版)), indexed in SCIE, EI, CSCD; 2024, Issue 5, pp. 1907-1922 (16 pages).
Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 62122043 and 62192753, in part by the Natural Science Foundation of Shandong Province for Distinguished Young Scholars under Grant No. ZR2022JQ31, and in part by the Innovative Research Groups of the National Natural Science Foundation of China under Grant No. 61821004.
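Illustrative note: the abstract does not state how the critic and actor are parametrized, so the following is only a minimal single-player sketch under a common assumption in linear-quadratic Q-learning, namely that the Q-function is quadratic in the stacked vector of state and control. It is not the authors' algorithm. The matrix `H`, the dimension `n_x`, and the function names are hypothetical and chosen purely for illustration. Under this assumption, a critic that has learned `H` implies the linear control `u = -K x` that an actor would try to approximate.

```python
import numpy as np


def quadratic_q(H, x, u):
    """Evaluate the assumed quadratic critic Q(x, u) = z^T H z, z = [x; u]."""
    z = np.concatenate([x, u])
    return float(z @ H @ z)


def greedy_gain(H, n_x):
    """Return K such that u = -K x minimizes z^T H z over u.

    Assumes H is symmetric and its control block H_uu is positive definite.
    """
    H_ux = H[n_x:, :n_x]   # cross block between control and state
    H_uu = H[n_x:, n_x:]   # control block
    return np.linalg.solve(H_uu, H_ux)


# Hypothetical critic matrix for a 2-dimensional state and a scalar control.
H = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 0.5],
              [1.0, 0.5, 2.0]])
n_x = 2

K = greedy_gain(H, n_x)      # gain implied by the critic
x = np.array([1.0, 2.0])
u_star = -K @ x              # control that minimizes Q(x, .) for this H
```

In the multi-player game described in the abstract, each player would hold its own critic and actor, and each Q-function would also depend on the other players' controls; the sketch omits that coupling as well as the online weight updates.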