This paper studies transfer learning in the context of reinforcement learning. We propose a novel transfer learning method that speeds up reinforcement learning with the aid of previously learnt tasks. Before performing extensive learning episodes, our method attempts to analyze the learning task through some exploration of the environment, and then attempts to reuse previous learning experience whenever it is possible and appropriate. The proposed method consists of four stages: 1) subgoal discovery, 2) option construction, 3) similarity searching, and 4) option reusing. To identify similar options, we propose a novel similarity measure between options, built upon the intuition that similar options have similar state-action probabilities. We evaluate our algorithm in extensive experiments, comparing it with existing methods. The results show that our method outperforms conventional non-transfer reinforcement learning algorithms, as well as existing transfer learning methods, by a wide margin.
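The intuition that similar options have similar state-action probabilities can be illustrated with a small sketch. The abstract does not define the measure itself, so everything here is an assumption made for illustration: the `Policy` representation, the function names `option_similarity` and `most_similar`, and the choice of total variation distance averaged over shared states. It should not be read as the measure proposed in the paper.

```python
from typing import Dict, Hashable, Sequence

# Hypothetical representation: an option's policy maps each state to a
# probability distribution over one shared, ordered action set.
Policy = Dict[Hashable, Sequence[float]]


def option_similarity(p: Policy, q: Policy) -> float:
    """Return a score in [0, 1]; 1 means identical action probabilities.

    Assumption: similarity is 1 minus the mean total variation distance
    over the states that both options cover. This is an illustrative
    stand-in, not the measure defined in the paper.
    """
    shared = set(p) & set(q)
    if not shared:
        return 0.0  # no common states, so nothing to compare
    total = 0.0
    for s in shared:
        if len(p[s]) != len(q[s]):
            raise ValueError("policies must share one action set")
        # Total variation distance between the two action distributions.
        total += 0.5 * sum(abs(a - b) for a, b in zip(p[s], q[s]))
    return 1.0 - total / len(shared)


def most_similar(target: Policy, library: Dict[str, Policy]) -> str:
    """Return the name of the stored option closest to the target option."""
    return max(library, key=lambda name: option_similarity(target, library[name]))
```

Under these assumptions, the similarity searching stage would amount to calling `most_similar` with a newly constructed option and a library of options kept from earlier tasks; the returned option is then the candidate for reuse.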