
Research on Reinforcement Learning Algorithm Based on Generative Adversarial Network (cited by: 3)
Abstract: To address the lag in overall training efficiency that reinforcement learning suffers with respect to its training samples, this study proposes a new reinforcement learning algorithm based on generative adversarial networks. The method uses the set of real experience samples as a template to generate theoretically feasible virtual samples. The agent is trained once on these samples and merges the good virtual samples into the real sample set, improving the quality of the training samples. A mountain-car simulation experiment on OpenAI Gym verifies the effectiveness of implementing a reinforcement learning algorithm with the generative adversarial network idea. Compared with the Q-learning algorithm, the proposed "reinforcement learning algorithm based on generative adversarial networks" (GRL), when tracking data output, has its objective function converge in fewer than about 40 iterations, which greatly improves learning speed and alleviates the lagging learning found in existing techniques.
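The sample-augmentation loop described in the abstract might be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's GRL implementation: the abstract does not say how virtual samples are generated or how a "good" sample is judged, so Gaussian jitter stands in for a trained generator and distance to the nearest real state stands in for a discriminator. All function names, parameters, and sample values are hypothetical.

```python
import random


def generate_virtual(real_samples, n, noise, rng):
    """Stand-in for a GAN generator (assumption): jitter randomly chosen real transitions."""
    virtual = []
    for _ in range(n):
        state, action, reward, next_state = rng.choice(real_samples)
        # Apply the same small shift to state and next state so the step stays coherent.
        shift = [rng.gauss(0.0, noise) for _ in state]
        new_state = tuple(s + d for s, d in zip(state, shift))
        new_next = tuple(s + d for s, d in zip(next_state, shift))
        virtual.append((new_state, action, reward, new_next))
    return virtual


def realism_score(sample, real_samples):
    """Stand-in for a discriminator (assumption): higher when the state is near a real one."""
    state = sample[0]
    nearest = min(
        sum((a - b) ** 2 for a, b in zip(state, real[0])) ** 0.5
        for real in real_samples
    )
    return -nearest


def augment(real_samples, n_virtual=100, noise=0.05, threshold=-0.05, seed=0):
    """Return the real samples plus the virtual samples that pass the score threshold."""
    rng = random.Random(seed)
    virtual = generate_virtual(real_samples, n_virtual, noise, rng)
    accepted = [v for v in virtual if realism_score(v, real_samples) >= threshold]
    return real_samples + accepted


# Hypothetical mountain-car transitions: ((position, velocity), action, reward, next state).
real = [
    ((-0.50, 0.00), 2, -1.0, (-0.49, 0.01)),
    ((-0.40, 0.02), 2, -1.0, (-0.38, 0.03)),
    ((-0.60, -0.01), 0, -1.0, (-0.61, -0.02)),
]
merged = augment(real)
print(len(real), len(merged))
```

In a fuller version the merged set would feed the agent's update step (for example a Q-learning update), and the acceptance rule would come from the trained networks rather than a fixed distance threshold.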
Author: YU Junjie (俞君杰) (Jiangsu Electric Power Information Technology Co. Ltd., Nanjing 210013, China)
Source: Microcomputer Applications (《微型电脑应用》), 2022, No. 6, pp. 174-176, 190 (4 pages)
Keywords: reinforcement learning; generative adversarial network; training samples; relative entropy; function convergence
