In some military application scenarios,Unmanned Aerial Vehicles(UAVs)need to perform missions with the assistance of on-board cameras when radar is not available and communication is interrupted,which brings challenge...In some military application scenarios,Unmanned Aerial Vehicles(UAVs)need to perform missions with the assistance of on-board cameras when radar is not available and communication is interrupted,which brings challenges for UAV autonomous navigation and collision avoidance.In this paper,an improved deep-reinforcement-learning algorithm,Deep Q-Network with a Faster R-CNN model and a Data Deposit Mechanism(FRDDM-DQN),is proposed.A Faster R-CNN model(FR)is introduced and optimized to obtain the ability to extract obstacle information from images,and a new replay memory Data Deposit Mechanism(DDM)is designed to train an agent with a better performance.During training,a two-part training approach is used to reduce the time spent on training as well as retraining when the scenario changes.In order to verify the performance of the proposed method,a series of experiments,including training experiments,test experiments,and typical episodes experiments,is conducted in a 3D simulation environment.Experimental results show that the agent trained by the proposed FRDDM-DQN has the ability to navigate autonomously and avoid collisions,and performs better compared to the FRDQN,FR-DDQN,FR-Dueling DQN,YOLO-based YDDM-DQN,and original FR outputbased FR-ODQN.展开更多
文摘In some military application scenarios,Unmanned Aerial Vehicles(UAVs)need to perform missions with the assistance of on-board cameras when radar is not available and communication is interrupted,which brings challenges for UAV autonomous navigation and collision avoidance.In this paper,an improved deep-reinforcement-learning algorithm,Deep Q-Network with a Faster R-CNN model and a Data Deposit Mechanism(FRDDM-DQN),is proposed.A Faster R-CNN model(FR)is introduced and optimized to obtain the ability to extract obstacle information from images,and a new replay memory Data Deposit Mechanism(DDM)is designed to train an agent with a better performance.During training,a two-part training approach is used to reduce the time spent on training as well as retraining when the scenario changes.In order to verify the performance of the proposed method,a series of experiments,including training experiments,test experiments,and typical episodes experiments,is conducted in a 3D simulation environment.Experimental results show that the agent trained by the proposed FRDDM-DQN has the ability to navigate autonomously and avoid collisions,and performs better compared to the FRDQN,FR-DDQN,FR-Dueling DQN,YOLO-based YDDM-DQN,and original FR outputbased FR-ODQN.