Deep deterministic policy gradient algorithm for UAV control (cited by: 9)

Abstract: This paper explores using the deep deterministic policy gradient (DDPG) algorithm to train an agent to learn a flight control policy for a small UAV. The agent observes velocity, position, and attitude angles over multiple data frames; its output actions are the rudder deflection angle and engine thrust commands; and its learning environment consists of the UAV's nonlinear model and the flight environment. While interacting with the environment, the agent receives a dense penalty that contains error information, as well as sparse rewards for achieving certain goals. This design effectively increases the diversity of the flight data samples and improves the agent's learning efficiency. The trained agent achieves end-to-end flight control, mapping position, velocity, and attitude angles directly to control commands. Flight control simulations were also carried out with changed waypoints, model parameter deviations, injected disturbances, and fault conditions. The results show that the agent not only completes the training task effectively but also handles a variety of flight tasks it did not encounter during training, demonstrating excellent generalization ability and robustness. The method has research value and can serve as an engineering reference.
Authors: HUANG Xu; LIU Jiarun; JIA Chenhui; WANG Zhaolei; ZHANG Jun (Beijing Aerospace Automatic Control Institute, Beijing 100854, China; National Key Laboratory of Science and Technology on Aerospace Intelligent Control, Beijing 100854, China)
Source: Acta Aeronautica et Astronautica Sinica (航空学报), 2021, Issue 11, pp. 397-407 (11 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61773341).
Keywords: deep deterministic policy gradient; small UAV; flight control; end-to-end; sparse reward
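
The abstract describes two design choices that a short sketch may make more concrete: an observation built from several consecutive data frames, and a reward that combines a dense error-based penalty with a sparse bonus for reaching a goal. The Python below is only an illustration under stated assumptions. The names `shaped_reward` and `FrameStack`, the weights `w_pos` and `w_att`, the bonus value, and the use of Euclidean error norms are all hypothetical; the record above does not give the paper's actual reward terms or frame count.

```python
import math
from collections import deque


def shaped_reward(pos_err, att_err, reached_goal,
                  w_pos=0.01, w_att=0.1, bonus=10.0):
    """Dense error penalty plus a sparse goal bonus (illustrative values only)."""
    # Dense part: a penalty that grows with the tracking error at every step.
    pos_norm = math.sqrt(sum(e * e for e in pos_err))
    att_norm = math.sqrt(sum(e * e for e in att_err))
    dense_penalty = -(w_pos * pos_norm + w_att * att_norm)
    # Sparse part: a bonus granted only on the step where a goal is reached.
    sparse_bonus = bonus if reached_goal else 0.0
    return dense_penalty + sparse_bonus


class FrameStack:
    """Keeps the last n_frames state frames and flattens them into one observation."""

    def __init__(self, n_frames, frame_dim):
        # Start with zero-filled frames so the observation size is constant.
        self.frames = deque(
            ([0.0] * frame_dim for _ in range(n_frames)), maxlen=n_frames
        )

    def push(self, frame):
        # Appending to a full deque silently drops the oldest frame.
        self.frames.append([float(x) for x in frame])
        return [x for f in self.frames for x in f]


if __name__ == "__main__":
    stack = FrameStack(n_frames=2, frame_dim=3)
    stack.push([1.0, 2.0, 3.0])
    print(stack.push([4.0, 5.0, 6.0]))
    print(shaped_reward([3.0, 4.0, 0.0], [0.0, 0.0, 0.0], reached_goal=False))
```

In a DDPG training loop, the flattened list returned by `push` would be fed to the actor and critic networks, and `shaped_reward` would be evaluated once per simulation step. Keeping the dense and sparse terms separate makes it easy to tune their relative scale.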
