This paper investigates an unmanned aerial vehicle (UAV)-enabled maritime secure communication network, where the UAV aims to provide communication service to a legitimate mobile vessel in the presence of multiple eavesdroppers. In such maritime communication networks (MCNs), it is challenging for the UAV to determine its trajectory over the ocean: since it cannot land or replenish energy on the sea surface, the trajectory must be pre-designed before the UAV takes off. Furthermore, the take-off location of the UAV and the sea lane of the vessel may be random, which leads to a highly dynamic environment. To address these issues, we propose two reinforcement learning schemes, the Q-learning and deep deterministic policy gradient (DDPG) algorithms, to solve the discrete and continuous UAV trajectory design problems, respectively. Simulation results validate the effectiveness and superior performance of the proposed reinforcement learning schemes over existing schemes in the literature. Additionally, the proposed DDPG algorithm converges faster and achieves higher utilities for the UAV than the Q-learning algorithm.
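The abstract does not specify the state, action, or reward definitions used in the paper, so the following is only a minimal illustrative sketch of the tabular Q-learning variant for the discrete trajectory case. It assumes a grid-world abstraction of the flight area, four candidate moves per time step, and a caller-supplied reward (e.g., a secrecy-rate proxy); none of these modelling choices are taken from the paper itself.

```python
import numpy as np

def q_learning_trajectory(reward_fn, grid=10, episodes=500, steps=50,
                          alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning over a grid abstraction of the UAV flight area (illustrative only)."""
    rng = np.random.default_rng(seed)
    n_states, n_actions = grid * grid, 4           # actions: up, down, left, right
    Q = np.zeros((n_states, n_actions))
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]

    for _ in range(episodes):
        # Random take-off cell, mirroring the random take-off location noted in the abstract.
        x, y = rng.integers(grid), rng.integers(grid)
        for _ in range(steps):
            s = x * grid + y
            # Epsilon-greedy action selection.
            a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
            dx, dy = moves[a]
            nx = int(np.clip(x + dx, 0, grid - 1))
            ny = int(np.clip(y + dy, 0, grid - 1))
            r = reward_fn(nx, ny)                  # hypothetical secrecy-utility reward
            s_next = nx * grid + ny
            # Standard temporal-difference update for Q-learning.
            Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
            x, y = nx, ny
    return Q

# Example use with a placeholder reward that favours cells near an assumed vessel lane.
if __name__ == "__main__":
    Q = q_learning_trajectory(lambda x, y: -abs(y - 5))
    print(Q.shape)  # (100, 4)
```

For the continuous case described in the abstract, the same interaction loop would replace the Q-table with DDPG's actor and critic networks so the UAV can output continuous headings; the reported finding that DDPG converges faster and attains higher utility is taken from the abstract, not reproduced by this sketch.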
Funding: supported by the Six Categories Talent Peak of Jiangsu Province (No. KTHY-039), the Future Network Scientific Research Fund Project (No. FNSRFP-2021-YB-42), the Science and Technology Program of Nantong (No. JC2021016), and the Key Research and Development Program of Jiangsu Province of China (No. BE2021013-1).