Abstract
The difficulty in self-balancing control of a two-wheeled robot lies in improving how quickly and how stably the robot reaches balance. To address the slow convergence and tendency to diverge of traditional reinforcement learning algorithms, a hierarchical reinforcement learning algorithm is proposed. The algorithm decomposes the target task into several subtasks and searches for the optimal policy for each subtask; when all sub-goals converge to their optimal values, the target task also converges to the optimum. In this algorithm, the reward function can be learned from a heuristic environment, which speeds up exploration of the unknown environment so that the robot reaches self-balance quickly and remains stable. Self-balancing simulation experiments were carried out on a two-wheeled robot. The simulation results show that, compared with traditional reinforcement learning algorithms, a two-wheeled robot using the improved algorithm exhibits better convergence of each control state and better learning performance, which effectively improves the stability control performance of the robot system.
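The decomposition idea in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm or robot model: the two subtasks (`tilt`, `position`), the discretized one-dimensional state space, the distance-based `heuristic_reward`, and all hyperparameters are assumptions chosen only to show one tabular Q-learner per subtask, with the overall task treated as solved once every subtask's greedy policy reaches its goal.

```python
import random

ACTIONS = (-1, 0, 1)  # move down one bin, stay, move up one bin


def heuristic_reward(next_state, goal):
    """Hypothetical shaped reward: the closer to the goal bin, the higher."""
    return -abs(next_state - goal)


def step(state, action, n_states):
    """Deterministic toy dynamics, clamped to the valid state range."""
    return min(max(state + action, 0), n_states - 1)


def train_subtask(n_states, goal, episodes=500, steps=20,
                  alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning for one subtask, using a uniformly random
    behavior policy (Q-learning is off-policy, so this still learns
    the greedy policy's values)."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(n_states)]
    for _ in range(episodes):
        state = rng.randrange(n_states)
        for _ in range(steps):
            a = rng.randrange(len(ACTIONS))
            nxt = step(state, ACTIONS[a], n_states)
            target = heuristic_reward(nxt, goal) + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


def greedy_reaches_goal(q, goal):
    """Check that the greedy policy drives every start state to the goal."""
    n_states = len(q)
    for start in range(n_states):
        state = start
        for _ in range(n_states):
            a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            state = step(state, ACTIONS[a], n_states)
        if state != goal:
            return False
    return True


# Two hypothetical subtasks, each given as (number of bins, goal bin).
subtasks = {"tilt": (7, 3), "position": (9, 4)}
tables = {name: train_subtask(n, g) for name, (n, g) in subtasks.items()}

# The overall task counts as solved only when every subtask is solved.
overall_solved = all(greedy_reaches_goal(tables[name], g)
                     for name, (n, g) in subtasks.items())
```

Training each subtask on its own small table keeps the state space per learner small, which is the usual motivation for a hierarchical decomposition; a real two-wheeled robot would need coupled dynamics that this sketch does not model.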
Source
Computer Simulation (《计算机仿真》)
CSCD
Peking University Core Journal (北大核心)
2016, Issue 7, pp. 383-387 (5 pages)
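Funding
Funding Program for Key Teachers in Higher Education Institutions: Domestic Visiting Scholar Program for Young Key Teachers in Higher Education Institutions (A1-5300-15-020201)
Shanghai Higher Education Science and Technology Development Fund: Shanghai University Experimental Technical Staff Development Program (A2-B-8950-13-0714)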
Keywords
Two-wheeled robot
Balancing control
Hierarchical reinforcement learning