Abstract
Owing to complex kinematics and dynamic environments, controlling a squad of fixed-wing Unmanned Aerial Vehicles (UAVs) remains a challenging problem. Taking fixed-wing UAVs as the research object and considering the randomness and uncertainty of complex dynamic environments, this paper addresses the coordination control problem of UAV formation with a model-free deep reinforcement learning method. To balance exploration and exploitation, a new action selection strategy, the ε-imitation strategy, is proposed by combining the ε-greedy strategy with an imitation strategy. Building on this strategy, the double Q-learning technique, and the dueling architecture, the DQN (Deep Q-Network) algorithm is improved into the ID3QN (Imitative Dueling Double Deep Q-Network) algorithm to boost learning efficiency. Finally, Hardware-In-the-Loop flight experiments conducted in a high-fidelity semi-physical simulation system demonstrate the adaptability and practicality of the proposed ID3QN coordinated control algorithm.
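The two ingredients named in the abstract can be illustrated with a short sketch. The function names, parameters (`beta` for the imitation probability), and the exact mixing rule below are illustrative assumptions, not the paper's implementation: the ε-imitation strategy is sketched as "explore with probability ε, otherwise imitate an expert-like action with some probability, otherwise act greedily," and the ID3QN value estimate is sketched as the standard dueling aggregation combined with the double Q-learning target.

```python
import random
import numpy as np

def epsilon_imitation_action(q_values, imitation_action, epsilon, beta):
    """Illustrative epsilon-imitation rule (assumed form, not the paper's exact rule):
    explore uniformly with probability epsilon; otherwise follow the
    imitation (expert-like) action with probability beta; else act greedily."""
    n_actions = len(q_values)
    if random.random() < epsilon:
        return random.randrange(n_actions)                       # explore
    if random.random() < beta:
        return imitation_action                                  # imitate expert policy
    return int(max(range(n_actions), key=lambda a: q_values[a]))  # exploit

def dueling_q(value, advantages):
    """Dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()

def double_q_target(reward, gamma, q_online_next, q_target_next, done):
    """Double Q-learning target: the online network selects the next action,
    the target network evaluates it, which reduces overestimation bias."""
    if done:
        return reward
    a_star = int(np.argmax(q_online_next))          # selection: online network
    return reward + gamma * q_target_next[a_star]   # evaluation: target network
```

With ε = 0 and β = 0 the rule reduces to pure greedy selection; the dueling and double-Q pieces are the standard formulations that ID3QN builds upon.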
Authors
XIANG Xiaojia;YAN Chao;WANG Chang;YIN Dong(College of Intelligence Science and Technology,National University of Defense Technology,Changsha 410073,China)
Source
《航空学报》
EI
CAS
CSCD
Peking University Core Journal
2021, No. 4, pp. 414-427 (14 pages)
Acta Aeronautica et Astronautica Sinica
Funding
National Natural Science Foundation of China (61906203)
Key Laboratory of UAV Special Technology Fund, Northwestern Polytechnical University (614230110080817).
Keywords
fixed-wing UAVs
UAV formation
coordination control
deep reinforcement learning
neural networks