Optimal Consensus of Heterogeneous Multi-Agent Systems Based on Q-Learning
Abstract A model-free control protocol design method based on off-policy reinforcement learning is proposed for the optimal consensus problem of heterogeneous discrete-time multi-agent systems with leaders. Because the agents in a heterogeneous multi-agent system have different state matrices, the dynamics of the local neighborhood error are complicated. Compared with existing distributed control schemes for multi-agent systems that rely on designing observers, the proposed approach of working with the global neighborhood error dynamics reduces computational complexity. First, the global neighborhood error dynamics of the multi-agent system are established using augmented variables. Second, the coupled Bellman equations and the coupled Hamilton-Jacobi-Bellman (HJB) equations are obtained from a quadratic value function. Third, the Nash equilibrium solution of the multi-agent optimal consensus problem is obtained by solving the coupled HJB equations, and a proof of the Nash equilibrium is given. Fourth, a model-free off-policy Q-learning algorithm is proposed to learn this Nash equilibrium solution. Finally, the proposed algorithm is implemented with a critic neural network structure and gradient descent, and a simulation example verifies its effectiveness.
Authors Cheng Weiran (程薇燃); Li Jinna (李金娜) (School of Information and Control Engineering, Liaoning Petrochemical University, Fushun, Liaoning 113001, China)
Source Journal of Liaoning Petrochemical University (《辽宁石油化工大学学报》), CAS, 2022, No. 4, pp. 59-67 (9 pages)
Funding National Natural Science Foundation of China (62073158); Liaoning Province Key Field Joint Open Fund (2019-KF-03-06); Basic Scientific Research Project of the Education Department of Liaoning Province (LJKZ0401); Research Fund of Liaoning Petrochemical University (2018XJJ-005).
Keywords Multi-agent system; Neural network; Reinforcement learning; Optimal consensus
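
The model-free, off-policy Q-learning idea named in the abstract can be illustrated with a much smaller problem. The sketch below is not the authors' algorithm: it assumes a single agent with linear dynamics and a quadratic cost (no leader, no neighbors, no Nash game), and it swaps the critic neural network trained by gradient descent for a plain least-squares fit of the quadratic Q-function. The function name `lq_q_learning`, the matrices `A`, `B`, `Qc`, `R`, and the sample sizes are all hypothetical choices made for illustration. The system matrices are used only to simulate data; the learning step sees only the samples.

```python
import numpy as np

def lq_q_learning(A, B, Qc, R, n_iter=20, n_samples=200, seed=0):
    """Off-policy Q-learning (policy iteration) for a single-agent LQ problem.

    Assumes A is stable, so the zero gain is an admissible initial policy.
    Returns the learned gain K (control law u = -K x) and the Q-function matrix H.
    """
    rng = np.random.default_rng(seed)
    n, m = B.shape
    # Data are collected once with a random behavior policy (off-policy).
    X = rng.normal(size=(n_samples, n))
    U = rng.normal(size=(n_samples, m))
    Xn = X @ A.T + U @ B.T  # A and B are used only here, to simulate transitions
    # Stage cost x'Qc x + u'R u for every sample.
    cost = np.einsum("ki,ij,kj->k", X, Qc, X) + np.einsum("ki,ij,kj->k", U, R, U)
    Z = np.hstack([X, U])
    phi = np.einsum("ki,kj->kij", Z, Z).reshape(n_samples, -1)
    K = np.zeros((m, n))
    for _ in range(n_iter):
        # Policy evaluation: the next action follows the target policy u = -K x.
        Zn = np.hstack([Xn, -Xn @ K.T])
        phi_next = np.einsum("ki,kj->kij", Zn, Zn).reshape(n_samples, -1)
        # Bellman equation z'Hz = cost + zn'H zn, solved for H by least squares.
        h, *_ = np.linalg.lstsq(phi - phi_next, cost, rcond=None)
        H = h.reshape(n + m, n + m)
        H = 0.5 * (H + H.T)
        # Policy improvement: minimize the quadratic Q-function over u.
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return K, H

# Hypothetical two-state example system.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
K, H = lq_q_learning(A, B, np.eye(2), np.array([[1.0]]))
print(K)
```

In this simplified setting the learned gain can be checked against the gain obtained from the discrete-time Riccati equation, which is the role the HJB solution plays in the abstract. A gradient-descent critic, as described in the abstract, would replace the `lstsq` call with iterative updates of the weights in `h`.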