Abstract
To address inter-UAV communication resource allocation in multi-UAV area coverage tasks, this study proposes a multi-agent dynamic communication resource allocation model based on reinforcement learning. The multi-agent spanning tree coverage (MSTC) method is used to generate the coverage route of each UAV in the mission area, and the communication links between the UAVs and the ground base station, as well as between UAVs, are modeled. Because the flight environment is uncertain, the long-term resource allocation problem is formulated as a stochastic game in which each air-to-air link between UAVs is treated as an agent, and each agent's action consists of selecting an operating frequency band and the transmit power at the sender. On this basis, a multi-agent reinforcement learning (MARL) model is designed based on the double deep Q-network (DDQN), so that each agent learns the optimal communication resource allocation strategy through feedback from the reward function. Simulation results show that the MARL model adaptively selects the best communication resource allocation strategy under dynamic trajectories, improves the payload delivery success rate under delay constraints, reduces the interference of air-to-air links on the air-to-ground downlink, and increases the total channel capacity.
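As a rough illustration of the DDQN-based learning described in the abstract, the sketch below shows how one agent (one air-to-air link) might decode a joint action into a frequency band and a transmit power, and how a double-DQN target is formed. The number of bands, the power levels, the discount factor, and the function names are all assumptions made for illustration; the abstract does not give the paper's actual settings or network design.

```python
import numpy as np

# Hypothetical action space: not taken from the paper.
N_BANDS = 4                      # assumed number of selectable frequency bands
POWER_LEVELS_DBM = [23, 10, 5]   # assumed discrete transmit power levels (dBm)
N_ACTIONS = N_BANDS * len(POWER_LEVELS_DBM)


def decode_action(action):
    """Map a flat action index to a (band index, transmit power in dBm) pair."""
    band, power_idx = divmod(action, len(POWER_LEVELS_DBM))
    return band, POWER_LEVELS_DBM[power_idx]


def ddqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double-DQN target: the online network picks the next action,
    the target network evaluates it."""
    if done:
        return float(reward)
    best_action = int(np.argmax(q_online_next))   # selection by online network
    return float(reward + gamma * q_target_next[best_action])  # evaluation by target network
```

Decoupling action selection from action evaluation is what distinguishes the double-DQN target from the plain DQN target, which would take the maximum over the target network's own estimates.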
Authors
卢毛毛
刘春辉
董赞亮
LU Maomao; LIU Chunhui; DONG Zanliang (School of Communication Engineering, Xidian University, Xi'an 710071, China; 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang 050299, China; Institute of Unmanned System, Beihang University, Beijing 100191, China; School of Electronics and Information Engineering, Beihang University, Beijing 100191, China)
Source
Journal of Beijing University of Aeronautics and Astronautics (《北京航空航天大学学报》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
2024, No. 9, pp. 2939-2950 (12 pages)
Funding
Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project, 2020 (2020AAA0108200).
Keywords
multi-UAV area coverage
dynamic communication resource allocation
reinforcement learning
double deep Q-network
multi-agent