
Model-based deep reinforcement learning algorithm based on memory exploration strategy
Original title: 基于记忆探索策略的有模型深度强化学习算法
Cited by: 1
Abstract: Deep reinforcement learning has shown great potential in many fields, but existing algorithms need a large number of samples to learn a good policy, while in real-world settings samples are typically scarce and expensive to collect. Improving sample utilization is therefore key to broadening the applicability of deep reinforcement learning. Besides model-based approaches, the agent's exploration strategy is another important factor affecting sample utilization. This paper introduces a memory-based exploration method into the agent's behavior policy: by searching past memories, it quickly generates high-return samples for the state-value network to learn from, which speeds up training. The proposed algorithm is evaluated on benchmark tasks in a simulated environment, which verifies its effectiveness.
Authors: NI Kun (倪坤); LIU Yun-long (刘云龙); YU Dan-ning (于丹宁) (Department of Automation, Xiamen University, Xiamen 361102, Fujian, China)
Source: Microelectronics & Computer (《微电子学与计算机》), 2021, No. 4, pp. 23-28 (6 pages)
Funding: National Natural Science Foundation of China (61772438, 61375077).
Keywords: deep reinforcement learning; sample utilization; model-based approaches; state value network; memory-based exploration
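
The abstract describes the memory-based exploration idea only at a high level, so the following is a minimal illustrative sketch rather than the authors' algorithm. It assumes that past experience is stored as (state, action, return) triples, that "searching past memory" means a nearest-neighbor lookup over stored states, and that the behavior policy mixes remembered actions with a base policy using a fixed probability. The names `EpisodicMemory`, `best_action`, `behavior_action`, `base_policy`, and `p_memory` are all hypothetical.

```python
import math
import random


class EpisodicMemory:
    """Hypothetical store of (state, action, return) triples from past episodes."""

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self.entries = []  # list of (state, action, episode_return)

    def add(self, state, action, episode_return):
        if len(self.entries) >= self.capacity:
            self.entries.pop(0)  # drop the oldest entry
        self.entries.append((tuple(state), action, episode_return))

    def best_action(self, state, k=5):
        """Among the k stored states closest to `state`, return the action
        with the highest recorded return, or None if the memory is empty."""
        if not self.entries:
            return None
        query = tuple(state)
        # Assumption: Euclidean distance is a reasonable similarity measure.
        nearest = sorted(self.entries, key=lambda e: math.dist(e[0], query))[:k]
        return max(nearest, key=lambda e: e[2])[1]


def behavior_action(state, memory, base_policy, p_memory=0.5, rng=random):
    """With probability p_memory, reuse a remembered high-return action;
    otherwise (or if the memory is empty) defer to the base policy."""
    if rng.random() < p_memory:
        remembered = memory.best_action(state)
        if remembered is not None:
            return remembered
    return base_policy(state)
```

In a sketch like this, the transitions produced by `behavior_action` would be the samples fed to the value network; how the paper actually weights memory against the learned policy and the learned model is not stated in the abstract.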