Abstract
This paper presents a reinforcement learning method based on a layered decomposition: a robot's complex behavior is broken into a series of simple behaviors that are learned offline and independently, and the structure, parameters, and functions of each layer are designed separately. The method reduces the state space and simplifies the design of the reinforcement functions, which improves both the learning speed and the accuracy of the learned results, and lets decisions be refined step by step during learning. Finally, with multi-robot obstacle avoidance as the task model, the avoidance problem is decomposed into three sub-behaviors that are learned separately: avoiding static obstacles, avoiding dynamic obstacles, and approaching the target point. This achieves adaptive behavior fusion for the robot, and simulation experiments verify the method's effectiveness.
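The decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the three-value state and action encodings, the reward functions `goal_reward` and `avoid_reward`, the randomly sampled toy transitions, and the fixed weighted-sum rule in `fuse` are all assumptions made for illustration, since the abstract does not specify the layer structures, parameters, or fusion mechanism. Each sub-behavior keeps its own small Q-table and is trained offline and independently with the standard tabular Q-learning update.

```python
import random
from collections import defaultdict

# Hypothetical discrete steering actions: turn left, go straight, turn right.
ACTIONS = (-1, 0, 1)
# Hypothetical coarse state per sub-behavior: -1 = left, 0 = ahead/none, 1 = right.
STATES = (-1, 0, 1)


class SubBehavior:
    """One simple behavior with its own small Q-table (tabular Q-learning)."""

    def __init__(self, alpha=0.2, gamma=0.5):
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor
        self.q = defaultdict(float)  # (state, action) -> estimated value

    def values(self, state):
        return [self.q[(state, a)] for a in ACTIONS]

    def update(self, s, a, r, s_next):
        # Standard Q-learning step: move Q(s, a) toward r + gamma * max_b Q(s', b).
        target = r + self.gamma * max(self.values(s_next))
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])


def goal_reward(s, a):
    # Assumed reward: +1 for steering toward the target bearing s.
    return 1.0 if a == s else 0.0


def avoid_reward(s, a):
    # Assumed reward: -1 for steering toward an obstacle on side s (0 = no obstacle).
    return -1.0 if s != 0 and a == s else 0.0


def train_offline(reward_fn, episodes=3000, seed=0):
    """Learn one sub-behavior in isolation from randomly sampled toy transitions."""
    rng = random.Random(seed)
    behavior = SubBehavior()
    for _ in range(episodes):
        s, a, s_next = rng.choice(STATES), rng.choice(ACTIONS), rng.choice(STATES)
        behavior.update(s, a, reward_fn(s, a), s_next)
    return behavior


def fuse(behaviors, states, weights):
    """Pick the action with the largest weighted sum of sub-behavior Q-values."""
    totals = [0.0] * len(ACTIONS)
    for name, behavior in behaviors.items():
        for i, v in enumerate(behavior.values(states[name])):
            totals[i] += weights[name] * v
    return ACTIONS[totals.index(max(totals))]


def build_demo():
    """Train the three sub-behaviors separately and return them with assumed weights."""
    behaviors = {
        "goal": train_offline(goal_reward, seed=0),
        "static": train_offline(avoid_reward, seed=1),
        "dynamic": train_offline(avoid_reward, seed=2),
    }
    weights = {"goal": 1.0, "static": 2.0, "dynamic": 2.0}
    return behaviors, weights
```

Calling `build_demo()` and then `fuse(behaviors, {"goal": 1, "static": 1, "dynamic": 0}, weights)` asks the fused controller what to do when the target and a static obstacle are both to the right. Because each sub-behavior sees only its own three-value state, each Q-table has 9 entries, whereas a single table over the joint state would need 27 states times 3 actions; this is the kind of state-space reduction the abstract refers to.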
Source
《吉林大学学报(工学版)》
EI
CAS
CSCD
PKU Core (Peking University core journal list)
2006, Issue S2, pp. 108-112 (5 pages)
Journal of Jilin University: Engineering and Technology Edition
Funding
Major Project of the Jilin Province Science and Technology Development Plan (20050326)
Keywords
automatic control technology
obstacle avoidance
reinforcement learning
Q-learning
multi-level learning