Reinforcement learning-based safety obstacle avoidance and capture guidance for UAV
Abstract: To address the conflict between flying around obstacles and tracking a target that an unmanned aerial vehicle (UAV) faces in a constrained environment, a reinforcement learning-based safe obstacle avoidance and capture guidance method for UAVs is proposed. A surround tracking controller is designed on the basis of polar coordinates to drive the UAV onto a preset circular orbit under GPS-denied conditions. The surround constraint and the obstacle constraint are formulated as a Markov process, with velocity, radial error, angular velocity error, and a collision function as the state space and the compensation applied to the controller as the action space. A reward function that accounts for tracking error and collision probability is designed, and the agent is trained with the deep deterministic policy gradient (DDPG) algorithm, which improves tracking performance, provides collision avoidance capability, and enables the UAV to encircle and track stationary or moving targets. In addition, curriculum learning is introduced during training to transfer previously learned policies to the current task, giving faster convergence than classical random parameter settings. Finally, simulation results show that the proposed algorithm can guide the UAV in circular surround control while efficiently avoiding obstacles.
Authors: Mei Zewei (梅泽伟); Shao Xingling (邵星灵); Liu Jun (刘俊) (North University of China, Taiyuan 030051, China)
Affiliation: North University of China
Source: Tactical Missile Technology (《战术导弹技术》), PKU Core Journal, 2024, No. 2, pp. 93-105 (13 pages)
Funding: National Natural Science Foundation of China (61803348).
Keywords: reinforcement learning; obstacle avoidance; unmanned aerial vehicles; target tracking; encircle; GPS-denied
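
The abstract names the state variables (velocity, radial error, angular velocity error, collision function) and a reward that weighs tracking error against collision risk, but it gives no formulas. The Python sketch below is one possible illustration of how such quantities could be computed in a planar setting. Every function name, the linear form of the collision function, the reward weights, and the use of relative Cartesian positions as inputs are assumptions made for this example, not the authors' definitions.

```python
import math
from dataclasses import dataclass


@dataclass
class RewardWeights:
    # Hypothetical weights; the paper's actual values are not given in the abstract.
    radial: float = 1.0
    angular: float = 0.5
    collision: float = 10.0


def polar_errors(uav_xy, target_xy, uav_vel, target_vel, radius_ref, omega_ref):
    """Return (radial error, angular velocity error) in polar coordinates
    centered on the target. Assumes planar motion and a nonzero range."""
    dx, dy = uav_xy[0] - target_xy[0], uav_xy[1] - target_xy[1]
    vx, vy = uav_vel[0] - target_vel[0], uav_vel[1] - target_vel[1]
    rho = math.hypot(dx, dy)
    if rho == 0.0:
        raise ValueError("UAV coincides with the target; bearing is undefined")
    # Angular rate of the target-to-UAV bearing: (x*vy - y*vx) / rho^2.
    omega = (dx * vy - dy * vx) / (rho * rho)
    return rho - radius_ref, omega - omega_ref


def collision_value(uav_xy, obstacle_xy, obstacle_radius, safe_margin):
    """Assumed collision function: 0 outside the safety margin, rising
    linearly to 1 at the obstacle surface."""
    gap = math.hypot(uav_xy[0] - obstacle_xy[0],
                     uav_xy[1] - obstacle_xy[1]) - obstacle_radius
    if gap >= safe_margin:
        return 0.0
    if gap <= 0.0:
        return 1.0
    return 1.0 - gap / safe_margin


def reward(radial_err, angular_err, collision, w=RewardWeights()):
    """Negative weighted sum of tracking errors and collision risk."""
    return -(w.radial * abs(radial_err)
             + w.angular * abs(angular_err)
             + w.collision * collision)
```

In a DDPG setup along the lines the abstract describes, values like these would feed the agent's observation and per-step reward, while the actor's output would be added to the baseline controller's command as a compensation term. The weights would need tuning so that the collision penalty dominates only near obstacles.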