Reinforcement Learning Approach with Environment-Adaptive Gaussian Noise Augmentation
Abstract: Reinforcement learning from state-vector inputs is a fundamental research direction with broad application prospects. However, current reinforcement learning methods are data-inefficient, which prolongs training and makes them difficult to apply in real-world environments. To address this problem, an environment-adaptive Gaussian noise augmentation (EAGNA) method is proposed and inserted as a module into the soft actor-critic (SAC) and proximal policy optimization (PPO) methods. Based on the distribution range of each element of the state vector in the task environment, Gaussian noise with a different mean and standard deviation is added to each element in order to augment the data. On three state-vector-based control tasks from the OpenAI Gym benchmark, EAGNA achieved higher average returns than the original algorithms, improving their data efficiency. Notably, on the Lunar Lander control task, which has complex state inputs, the average return obtained with EAGNA was 30.52 and 26.09 higher than that of the SAC and PPO methods, respectively.
Authors: ZHU Leqian; PAN Zhisong (College of Command & Control Engineering, Army Engineering University of PLA, Nanjing 210007, China)
Source: Journal of Army Engineering University of PLA, 2024, No. 2, pp. 57-62 (6 pages)
Funding: National Natural Science Foundation of China (62076251).
Keywords: reinforcement learning; data augmentation; Gaussian noise; state vector input; environment adaptation
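
The abstract describes adding Gaussian noise whose mean and standard deviation differ per state-vector element, chosen from each element's distribution range. The record does not give the actual rule, so the following is only a minimal, dependency-free sketch under stated assumptions: the helper names (element_ranges, noise_stds, augment), the zero noise mean, and the choice of a standard deviation equal to a fixed fraction (0.05) of each element's observed range are all hypothetical and are not taken from the paper.

```python
import random


def element_ranges(states):
    """Return the observed (min, max) of each element across a batch of state vectors."""
    columns = list(zip(*states))
    return [(min(col), max(col)) for col in columns]


def noise_stds(ranges, fraction=0.05):
    """Hypothetical rule: noise std is a fixed fraction of each element's observed range."""
    return [fraction * (hi - lo) for lo, hi in ranges]


def augment(state, means, stds, rng):
    """Add independent Gaussian noise to every element, with per-element mean and std."""
    return [x + rng.gauss(mu, sigma) for x, mu, sigma in zip(state, means, stds)]


rng = random.Random(0)

# Toy batch whose three elements live on very different scales.
bounds = [(-1.0, 1.0), (-100.0, 100.0), (0.0, 0.01)]
batch = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(256)]

stds = noise_stds(element_ranges(batch))
means = [0.0] * len(stds)  # assumption: zero-mean noise for every element
augmented = [augment(state, means, stds, rng) for state in batch]
```

In an SAC or PPO training loop, a function like augment would presumably be applied to the states sampled for each update, so that elements with a wide range receive proportionally larger perturbations than elements with a narrow range.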