Abstract
To address trajectory tracking control of underactuated unmanned surface vehicles (USVs), a deep reinforcement learning (DRL) trajectory tracking control algorithm based on proximal policy optimization (PPO) is proposed. To guide the controller network toward correct convergence, a DRL controller built on a long short-term memory (LSTM) network layer is constructed, and the corresponding state space and reward function are designed. To improve the robustness of the controller, a dataset of trajectory tasks is generated to simulate complex task environments and is used as the training input for the DRL controller. Simulation results show that the proposed algorithm converges effectively, tracks trajectories accurately in disturbed environments, and has considerable potential for practical application.
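For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate loss that PPO-based controllers optimize. It is not the authors' implementation: the function name `ppo_clip_loss` and the default `clip_eps=0.2` are assumptions chosen for illustration, and the abstract does not state the paper's actual state space, reward function, or hyperparameters.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Negative PPO clipped surrogate objective, averaged over samples.

    clip_eps=0.2 is a commonly used default, assumed here for illustration.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    # Return a loss to minimize, hence the sign flip.
    return -total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the loss reduces to the negative mean advantage. When the ratio leaves the clipping interval in the direction that would increase the objective, the clipped term caps the gain, which limits the size of each policy update.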
Authors
XIA Jiawei (夏家伟)
ZHU Xufang (朱旭芳)
LUO Yasong (罗亚松)
WU Zhaodong (吴兆东)
Affiliations: School of Weaponry Engineering, Naval University of Engineering, Wuhan 430033, China; School of Electronic Engineering, Naval University of Engineering, Wuhan 430033, China
Source
Journal of Huazhong University of Science and Technology (Natural Science Edition) (《华中科技大学学报(自然科学版)》)
EI
CAS
CSCD
PKU Core Journals (北大核心)
2023, Issue 5, pp. 74-80 (7 pages)
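Funding
Natural Science Foundation of Hubei Province (2018CFC865)
China Postdoctoral Science Foundation (2016T45686)
All-Army Military Research Funding Project (全军军事类研究资助项目) (YJ2020B117)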