UAV urban high-rise firefighting based on multi-agent proximal policy optimization
Abstract Urban high-rise firefighting remains a challenging problem, and unmanned aerial vehicles (UAVs) offer an effective solution. In this work, we formulate urban high-rise firefighting as a partially observable Markov decision process (POMDP) and propose a multi-agent proximal policy optimization (MAPPO) algorithm with a β-variational auto-encoder (β-VAE) to solve it. MAPPO is a multi-agent extension of proximal policy optimization (PPO) that allows agents to cooperate with each other. Built on the Actor-Critic architecture, the algorithm employs a critic network containing global information and an actor network with shared information. The β-VAE serves as an efficient means of processing visual perception information to assist deep reinforcement learning (DRL). UAVs are rewarded for approaching the fire area and for successfully completing firefighting tasks. To evaluate the proposed method, we build a large-scale, complex urban fire environment based on AirSim and UrbanScene3D and compare our algorithm with multi-agent deep deterministic policy gradient (MADDPG). The experimental results show that MAPPO is effective for the urban high-rise firefighting problem and significantly outperforms MADDPG.
Authors ZHAO Xiaohu (赵小虎); WU Ruocheng (吴若诚); JIANG Hanli (江涵立) (China Academy of Electronics and Information Technology, Beijing 1300041, China; Zhejiang Petrochemical Trading Center, Zhoushan 316000, China; Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation, Hefei 241002, China)
Source Journal of Changchun University of Technology (《长春工业大学学报》), CAS, 2023, No. 6, pp. 552-562 (11 pages)
Funding Open Fund Project of the Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation (CSSAE-2021-004).
Keywords UAV navigation; deep reinforcement learning; multi-agent collaboration
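
The abstract names two standard building blocks, the PPO clipped surrogate objective and the β-VAE loss, without giving their form. The sketch below shows the textbook versions of both in plain Python as a rough illustration only. The function names, the clip range `clip_eps=0.2`, and the weight `beta=4.0` are assumptions made for this example, not values taken from the paper, and the centralized critic and the AirSim environment are not modelled.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: per-sample log-probabilities of the taken actions
    under the current and the data-collecting policy.
    clip_eps is an assumed default, not a value from the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|o) / pi_old(a|o)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def beta_vae_loss(recon_error, mu, log_var, beta=4.0):
    """Reconstruction error plus beta times KL(N(mu, sigma^2) || N(0, I)).

    mu / log_var: latent mean and log-variance for one encoded observation.
    beta is an assumed default, not a value from the paper.
    """
    kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                    for m, lv in zip(mu, log_var))
    return recon_error + beta * kl
```

In a MAPPO-style setup, each UAV's actor would typically contribute samples to `ppo_clip_objective`, with advantages estimated from a critic that sees global state; that wiring is omitted here.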