Curriculum Learning-Based Simulation of UAV Air Combat Under Sparse Rewards
Abstract: To address the poor exploration and sparse rewards that conventional reinforcement learning faces in air combat environments, a curriculum learning distributed proximal policy optimization (CLDPPO) reinforcement learning algorithm is proposed. A reward function embedding expert knowledge is incorporated, a discretized action space is designed, and an actor-critic network that separates local observations from global observations is constructed. Offensive, defensive, and comprehensive curricula are defined for the unmanned aerial vehicle (UAV) so that it learns combat skills progressively, starting from basic courses, and its combat capability improves stage by stage. Experimental results show that a UAV trained through curriculum learning defeats both an expert system and mainstream reinforcement learning algorithms by a margin, learns air combat tactics on its own, and effectively mitigates the sparse-reward problem.
Authors: Zhu Jingyu (祝靖宇); Zhang Hongli (张宏立); Kuang Minchi (匡敏驰); Shi Heng (史恒); Zhu Jihong (朱纪洪); Qiao Zhi (乔直); Zhou Wenqing (周文卿). Affiliations: School of Electrical Engineering, Xinjiang University, Urumqi 830000, China; Department of Precision Instrument, Tsinghua University, Beijing 100084, China; Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China.
Source: Journal of System Simulation (系统仿真学报), 2024, Issue 6, pp. 1452-1467 (16 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Keywords: UAVs; air combat; sparse reward; curriculum learning; distributed proximal policy optimization (DPPO)
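
The abstract describes training that moves through offensive, defensive, and comprehensive courses in stages, but it does not say how the switch between courses is decided. The following is a minimal sketch of one common way to do this, assuming promotion is triggered when a rolling win rate crosses a threshold. The class name `CurriculumScheduler`, the stage labels, the window size, and the threshold are all hypothetical and are not taken from the paper.

```python
from collections import deque

# Hypothetical stage labels, named after the three curricula in the abstract.
STAGES = ("offense", "defense", "comprehensive")


class CurriculumScheduler:
    """Advance to the next course once the recent win rate is high enough."""

    def __init__(self, stages=STAGES, window=100, promote_at=0.8):
        self.stages = tuple(stages)
        self.window = window          # assumed: episodes per win-rate estimate
        self.promote_at = promote_at  # assumed: win rate needed to advance
        self.index = 0
        self.results = deque(maxlen=window)

    @property
    def stage(self):
        return self.stages[self.index]

    def record(self, won):
        """Log one episode outcome and return the stage for the next episode."""
        self.results.append(bool(won))
        window_full = len(self.results) == self.window
        if window_full and self.index < len(self.stages) - 1:
            win_rate = sum(self.results) / self.window
            if win_rate >= self.promote_at:
                self.index += 1
                self.results.clear()  # start the new course with a fresh estimate
        return self.stage


if __name__ == "__main__":
    scheduler = CurriculumScheduler(window=4, promote_at=0.75)
    for won in [True, True, False, True]:
        stage = scheduler.record(won)
    print(stage)  # 3 wins out of 4 meets the 0.75 threshold, so prints "defense"
```

In a training loop, the returned stage would select which scenario and reward configuration the environment uses for the next episode. The policy update itself (DPPO in the paper) is outside the scope of this sketch.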