A Dynamic Pre-Deployment Strategy of UAVs Based on Multi-Agent Deep Reinforcement Learning

Cited by: 3
Abstract: Traditional optimization algorithms are ill-suited to the dynamic deployment of communication Unmanned Aerial Vehicles (UAVs) over long time scales: their complexity is too high, and they struggle to keep pace with dynamic environment information. To address these shortcomings, this paper proposes a dynamic pre-deployment strategy for UAVs based on Multi-Agent Deep Reinforcement Learning (MADRL). First, a deep spatio-temporal network model predicts the expected rate demand of users in the coverage area, capturing dynamic environment information. The concept of user satisfaction is defined to characterize the fairness of the service that users receive from the UAVs, and an optimization model is formulated with the goal of maximizing long-term overall user satisfaction while minimizing the UAVs' movement and transmission energy consumption. Second, the model is transformed into a Partially Observable Markov Game (POMG), and an MADRL-based algorithm, H-MADDPG, is proposed to find the best decisions for trajectory planning, user association, and power allocation in this POMG. H-MADDPG uses a hybrid network structure to extract features from multi-modal inputs and adopts centralized training with distributed execution so that decisions can be trained and executed efficiently. Finally, simulation results demonstrate the effectiveness of the proposed algorithm.
Authors: TANG Lun (唐伦); LI Zhixuan (李质萱); PU Hao (蒲昊); WANG Zhiping (汪智平); CHEN Qianbin (陈前斌) (School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
Source: Journal of Electronics & Information Technology (《电子与信息学报》), 2023, No. 6, pp. 2007-2015 (9 pages). Indexed in EI, CSCD, and the Peking University Core Journals list (北大核心).
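Funding: National Natural Science Foundation of China (62071078); Science and Technology Research Program of the Chongqing Municipal Education Commission (KJZD-M201800601); Sichuan-Chongqing Joint Key Research and Development Project (2021YFQ0053).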
Keywords: Unmanned Aerial Vehicle (UAV) communication; Dynamic deployment; Partially Observable Markov Game (POMG); Multi-Agent Deep Reinforcement Learning (MADRL)
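
The abstract describes an objective that rewards long-term user satisfaction and penalizes UAV movement and transmission energy, but this record does not give the formulas. The following is only a minimal sketch of how such a per-step reward could be assembled. The satisfaction definition (served fraction of predicted demand, capped at 1), the function names, and the `energy_weight` trade-off parameter are all illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence


def user_satisfaction(achieved_rate: float, demanded_rate: float) -> float:
    """Hypothetical satisfaction: served fraction of the predicted demand, capped at 1.

    This definition is an assumption for illustration; the paper's own
    definition is not reproduced in the abstract.
    """
    if demanded_rate <= 0:
        # A user with no predicted demand is treated as fully satisfied.
        return 1.0
    return min(achieved_rate / demanded_rate, 1.0)


def step_reward(
    achieved_rates: Sequence[float],
    demanded_rates: Sequence[float],
    move_energy: float,
    tx_energy: float,
    energy_weight: float = 0.1,  # assumed trade-off weight, not from the paper
) -> float:
    """Total user satisfaction minus a weighted energy cost for one time step."""
    if len(achieved_rates) != len(demanded_rates):
        raise ValueError("achieved_rates and demanded_rates must have equal length")
    total_satisfaction = sum(
        user_satisfaction(a, d) for a, d in zip(achieved_rates, demanded_rates)
    )
    return total_satisfaction - energy_weight * (move_energy + tx_energy)


if __name__ == "__main__":
    # Two users: one served at half of its demand, one over-served.
    print(step_reward([5.0, 20.0], [10.0, 10.0], move_energy=2.0, tx_energy=3.0))
```

In a multi-agent setting, a reward of this shape would typically be accumulated over time and shared or split among the UAV agents; how the paper actually does this is not stated in the abstract.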