Abstract
This paper studies path planning for UAV formations based on deep reinforcement learning (DRL). To address the slow convergence and sparse rewards of reinforcement learning models in formation control, the artificial potential field method is introduced into deep reinforcement learning, and a network training framework for UAV formation path planning is established. In addition, a formation switching reward function is designed for training according to the formation control objective. A simulation training environment for reinforcement-learning-based UAV formation path planning is built on the AirSim and UE4 simulators, realizing formation path planning control in an environment containing threat areas. Comparative experiments verify that the proposed algorithm outperforms the baseline algorithm in formation stability and collision rate, and that it converges faster.
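The abstract does not give the potential functions or the reward terms, so the following is only a minimal sketch of how an artificial potential field can supply a dense reward signal to a reinforcement learning agent. It assumes the common textbook form (a quadratic attractive potential toward the goal and a Khatib-style repulsive potential around spherical threat areas) and rewards the decrease in total potential per step. All function names, gains (`k_att`, `k_rep`), the influence distance `d0`, and the threat representation are hypothetical and are not taken from the paper.

```python
import math

# Illustrative sketch only: the gains, the influence distance d0 and the
# (center, radius) threat model are assumptions, not the paper's values.

def attractive_potential(pos, goal, k_att=1.0):
    """Quadratic potential that grows with distance to the goal."""
    d = math.dist(pos, goal)
    return 0.5 * k_att * d * d

def repulsive_potential(pos, threats, k_rep=1.0, d0=5.0, eps=1e-6):
    """Sum of repulsive potentials from spherical threat areas.

    Each threat is a (center, radius) pair. A threat contributes only
    when the UAV is closer than d0 to its surface.
    """
    total = 0.0
    for center, radius in threats:
        # Distance to the threat surface, clamped to avoid division by zero.
        d = max(math.dist(pos, center) - radius, eps)
        if d < d0:
            total += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return total

def potential(pos, goal, threats):
    """Total potential: attraction to the goal plus repulsion from threats."""
    return attractive_potential(pos, goal) + repulsive_potential(pos, threats)

def shaped_reward(prev_pos, pos, goal, threats, scale=1.0):
    """Dense reward: positive when a step lowers the total potential."""
    return scale * (potential(prev_pos, goal, threats) - potential(pos, goal, threats))
```

In a sketch like this, the agent receives feedback at every step rather than only on reaching the goal or colliding, which is the usual motivation for combining a potential field with a sparse task reward. A formation-keeping term would be added separately.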
Authors
周从航
李建兴
石宇静
林致睿
林航航
ZHOU Conghang; LI Jianxing; SHI Yujing; LIN Zhirui; LIN Hanghang (School of Electronic, Electrical Engineering and Physics, Fujian University of Technology, Fuzhou 350000, China; Technical Development Base of Industrial Integration Automation of Fujian Province, Fuzhou 350000, China)
Source
《电光与控制》
CSCD
Peking University Core Journal (北大核心)
2024, Issue 10, pp. 27-33 (7 pages)
Electronics Optics & Control
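Funding
Natural Science Foundation of Fujian Province (2020J01876)
Scientific Research Start-up Fund of Fujian University of Technology (GY-Z21215, GY-Z21216).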