Abstract
Traditional reinforcement learning algorithms usually assume that the state and action spaces are discrete, whereas many real problems have continuous state spaces; this greatly limits the practical application of reinforcement learning. To overcome this limitation, this paper proposes a kernel-based reinforcement learning algorithm that handles problems with continuous state spaces directly. The algorithm is validated on the mountain car problem, which has a continuous state space and a discrete action space. Experiments show that, compared with the traditional approach of first discretizing the state space, the algorithm converges to a better policy with less training data when handling problems with continuous state spaces.
Source
《微计算机信息》
PKU Core Journal (北大核心)
2008, No. 4, pp. 243-245 (3 pages)
Control & Automation
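Funding
National Basic Research Program of China (973 Program); project name: Machine Learning and Data Description; grant number: 2004CB318103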