Abstract
To address the long commissioning time and poor stability of existing space manipulator control methods in practical applications, a control method based on deep reinforcement learning is proposed. First, a simulation environment is built to generate data. The simulation environment then interacts with the deep reinforcement learning algorithm through state variables, and the neural network parameters are trained through a reward function. Finally, the Proximal Policy Optimization (PPO) algorithm is used to control the space manipulator so that the gripper moves to a specific position below the object. Experimental results show that the proposed control method converges quickly and enables the space manipulator to complete the specified task. After the neural network converges, jitter is effectively suppressed, indicating that the algorithm improves control stability.
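As a rough illustration of the two ingredients the abstract names, the sketch below shows the standard PPO clipped surrogate objective together with a distance-based reward. The abstract does not give the paper's reward function, state variables, or hyperparameters, so `distance_reward`, `clip_eps=0.2`, and all argument names are assumptions made for illustration only, not the authors' implementation.

```python
import math


def distance_reward(gripper_pos, target_pos):
    """Hypothetical reward: negative Euclidean distance between the gripper
    and the target position below the object (an assumption, not the
    paper's reward function)."""
    squared = sum((g - t) ** 2 for g, t in zip(gripper_pos, target_pos))
    return -math.sqrt(squared)


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the sampled actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    clip_eps=0.2 is a commonly used default, assumed here.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio r_t
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Clipping the probability ratio limits how far a single update can move the policy away from the one that collected the data, which is one plausible reason a PPO-trained controller may show less jitter once training has converged.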
Authors
Li Heyu (李鹤宇)
Lin Tingyu (林廷宇)
Zeng Bi (曾贲)
Shi Guoqiang (施国强)
Beijing Institute of Electronic System Engineering, Beijing 100854, China; Beijing Simulation Center, Beijing 100854, China
Source
Aerospace Control (《航天控制》)
CSCD
Peking University Core Journals (北大核心)
2020, No. 6, pp. 38-43 (6 pages)
Keywords
Space manipulator
Neural network
Deep reinforcement learning
Proximal Policy Optimization (PPO) algorithm