Abstract
End-to-end reinforcement learning methods commonly used for autonomous UAV navigation suffer from low training efficiency and poor generalization and generality. To address these problems, a brain-inspired navigation model is introduced. A brain-inspired cell navigation model is built on a long short-term memory (LSTM) neural network; by integrating and encoding the self-motion information of the UAV agent, it produces grid cell and head direction cell encodings, which are then used as a supplementary state representation for the deep reinforcement learning algorithm D3QN. Experiments in the AirSim simulation environment show that introducing the brain-inspired navigation model effectively improves the training capability of the algorithm and the navigation performance of the UAV agent. Compared with the original D3QN algorithm, with the target fixed during initial training, the success rate of reaching the target increases by 2.54% to 97.11%. When training continues after the target is changed, the success rate of reaching the target is 99.45%, whereas D3QN achieves only 11.46% and fails to find the new target point. This indicates that the generalization ability of the algorithm is effectively improved.
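The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: supplementing the agent's state with grid cell and head direction cell codes, and the D3QN (dueling double DQN) value computation. All function names, vector sizes, and numbers are hypothetical, and the cell codes are assumed to come from an LSTM model that is not shown here.

```python
def augment_state(observation, grid_code, head_direction_code):
    """Concatenate the raw observation with the brain-inspired cell codes.

    Assumption: the codes are plain activation vectors produced elsewhere
    (for example by an LSTM fed with self-motion information).
    """
    return list(observation) + list(grid_code) + list(head_direction_code)


def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def double_dqn_target(reward, done, gamma, next_q_online, next_q_target):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


# Hypothetical usage with made-up numbers.
state = augment_state([0.5], [0.1, 0.2], [0.9])
q_values = dueling_q_values(1.0, [1.0, 2.0, 3.0])
target = double_dqn_target(1.0, False, 0.5, [0.2, 0.8], [4.0, 2.0])
```

In this sketch the augmentation is a simple concatenation, which leaves the reinforcement learning algorithm itself unchanged; whether the paper combines the codes with the state in this way is not stated in the abstract.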
Authors
吴勇
彭辉
熊峰钥
WU Yong; PENG Hui; XIONG Fengyue (School of Software Engineering, Chengdu University of Information Technology, Chengdu 610228, China)
Source
《计算机测量与控制》
2024, No. 7, pp. 225-231 (7 pages)
Computer Measurement & Control
Funding
Supported by the Sichuan Science and Technology Program (2019YJ0356).
Keywords
UAV
deep reinforcement learning
brain-inspired navigation
D3QN
autonomous navigation