Abstract
Multi-agent deep reinforcement learning extends deep reinforcement learning to multi-agent problems. Within it, methods based on value function factorization have performed well and are currently a focus of research and application. This paper introduces the main principles and framework of multi-agent deep reinforcement learning based on value function factorization. Drawing on recent related work, it identifies three research hotspots: improving the fitting ability of the mixing network, improving convergence performance, and improving algorithm scalability. It analyzes the causes of these three problems in terms of algorithmic constraints, environmental complexity, and neural network limitations. Existing studies are classified by the problem they address and the method they use; the common features of similar methods are summarized, and the advantages and disadvantages of different methods are analyzed. Finally, applications of multi-agent deep reinforcement learning based on value function factorization in two active fields, network node control and unmanned formation control, are described.
Authors
高玉钊
聂一鸣
GAO Yuzhao; NIE Yiming (National Innovation Institute of Defense Technology, Academy of Military Sciences, Beijing 100071, China)
Source
《计算机科学》
CSCD
PKU Core (Peking University Core Journals)
2024, Issue S01, pp. 22-30 (9 pages)
Computer Science
Keywords
Multi-agent deep reinforcement learning
Value function factorization
Fitting ability
Convergence effect
Scalability