Unmanned surface vehicles (USVs) are important autonomous marine robots that have been studied and gradually put into practice. However, the autonomous navigation of USVs, especially obstacle avoidance in complicated marine environments, remains a fundamental problem. After studying the characteristics of complicated marine environments, we propose a novel adaptive obstacle avoidance algorithm for USVs, based on the Sarsa on-policy reinforcement learning algorithm. The proposed algorithm is composed of a local avoidance module and an adaptive learning module, which are organized in a “divide and conquer” strategy-based architecture. A course angle compensation strategy is proposed to offset the disturbances from sea wind and currents. In the design of the payoff value function of the learning strategy, the course deviation angle and its tendency are introduced into the action reward and penalty policies. The validity of the proposed algorithm is verified by comparative experiments in simulations and sea trials in three sea-state marine environments. The results show that the algorithm can enhance the autonomous navigation capacity of USVs in complicated marine environments.
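As an illustration of the learning component named in the abstract, the following is a minimal sketch of a tabular Sarsa update paired with a payoff that depends on the course deviation angle and its tendency. It is not the authors' implementation: the action set `ACTIONS`, the reward shape in `course_reward`, the state encoding, and the values of `alpha` and `gamma` are all assumptions chosen for illustration, and the abstract does not specify any of them.

```python
import random
from collections import defaultdict

# Hypothetical discrete rudder commands: turn to port, hold course, turn to starboard.
ACTIONS = (-1, 0, 1)


def course_reward(deviation, prev_deviation):
    """Hypothetical payoff built from the course deviation angle (radians).

    The first term penalizes the current absolute deviation; the second
    ("tendency") term is positive when the deviation is shrinking and
    negative when it is growing.
    """
    tendency = abs(prev_deviation) - abs(deviation)
    return -abs(deviation) + tendency


def epsilon_greedy(q, state, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """Standard on-policy Sarsa update.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a)),
    where a' is the action actually selected in s' by the behavior policy.
    """
    td_target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (td_target - q[(s, a)])
    return q[(s, a)]


# Usage: one learning step on a Q-table that defaults to zero.
q_table = defaultdict(float)
reward = course_reward(deviation=0.1, prev_deviation=0.3)
next_action = epsilon_greedy(q_table, state=1, epsilon=0.0)
sarsa_update(q_table, s=0, a=1, r=reward, s_next=1, a_next=next_action)
```

Because Sarsa is on-policy, the update uses the action that the exploration policy actually selects in the next state, rather than the maximum over actions as Q-learning would.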