Abstract
Actor-Critic (AC) is a class of reinforcement learning methods with good performance and convergence guarantees. However, the agent does not learn the dynamics of the environment while it learns and improves its policy, which limits the performance of AC methods to a certain extent. In addition, AC methods must represent the policy and the value function approximately, and the encoding of states and actions, together with the associated parameters, has an important influence on these methods. Tile coding is simple to use and has low computational time complexity, so we combined tile coding with a model-based Actor-Critic method and applied the resulting algorithm to reinforcement learning simulation experiments. The experimental results show that the algorithm performs well.
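To make the state encoding concrete, the following is a minimal tile-coding sketch in Python. The TileCoder class, its parameters, and the example state bounds are illustrative assumptions for this sketch, not the paper's actual implementation; it shows only the standard idea of mapping a continuous state to a few active binary features across several offset tilings, which a linear actor and critic can then use.

```python
import numpy as np

# Minimal tile-coding sketch (hypothetical; not the paper's exact implementation).
# A continuous state in [low, high] is mapped to the indices of the active tiles
# across several offset tilings. The resulting sparse binary feature vector can
# be used for linear approximation of the actor's policy and the critic's value.

class TileCoder:
    def __init__(self, low, high, n_tilings=8, n_tiles=8):
        self.low = np.asarray(low, dtype=float)
        self.high = np.asarray(high, dtype=float)
        self.n_tilings = n_tilings      # number of overlapping tilings
        self.n_tiles = n_tiles          # tiles per dimension in each tiling
        self.dims = self.low.size
        self.tile_width = (self.high - self.low) / n_tiles
        # Each tiling is shifted by a fraction of a tile width, so the tilings
        # partition the state space at different offsets.
        self.offsets = [i / n_tilings * self.tile_width for i in range(n_tilings)]
        # Total number of binary features.
        self.n_features = n_tilings * n_tiles ** self.dims

    def active_tiles(self, state):
        """Return the index of the one active tile in each tiling."""
        state = np.asarray(state, dtype=float)
        indices = []
        for t, offset in enumerate(self.offsets):
            # Per-dimension tile coordinates of the (shifted) state.
            coords = np.floor((state - self.low + offset) / self.tile_width)
            coords = np.clip(coords, 0, self.n_tiles - 1).astype(int)
            # Flatten per-dimension coordinates into one feature index.
            flat = 0
            for c in coords:
                flat = flat * self.n_tiles + c
            indices.append(t * self.n_tiles ** self.dims + flat)
        return indices

# Usage: encode a 2-D state, e.g. (position, velocity) in a control task.
coder = TileCoder(low=[-1.2, -0.07], high=[0.6, 0.07])
features = coder.active_tiles([-0.5, 0.01])
print(features)  # n_tilings active feature indices out of coder.n_features
```

Because only n_tilings features are active for any state, updating the actor and critic weights costs O(n_tilings) per step, which is the low computational time complexity the abstract refers to.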
Source
Computer Science (《计算机科学》)
Indexed in CSCD; Peking University Core Journal
2014, No. 6, pp. 239-242, 249 (5 pages)
Funding
Supported by the National Natural Science Foundation of China (61070122, 61373094, 61070223, 61103045), the Natural Science Foundation of Jiangsu Province (BK2009116), and the Natural Science Research Project of Jiangsu Higher Education Institutions (09KJA520002).