Research review of general plugins for deep reinforcement learning
Abstract: A general-purpose plugin for deep reinforcement learning is a type of algorithm that can be attached to most native algorithms and is compatible with other kinds of plugins. Depending on the environment, a native algorithm combined with suitable plugins forms different variants that achieve better results in aspects such as training speed and stability. Based on what the plugins in these variants have in common within the training process, they are divided into six categories: general network models, intrinsic reward, experience replay, self-play, imitation learning, and curriculum learning. This review surveys the general plugins commonly used in these six categories, introduces their application scenarios and main roles in deep reinforcement learning, and proposes future research priorities: 1) improving the efficiency of experience utilization; 2) designing and training general neural network architectures; 3) improving exploration efficiency in sparse-reward environments; 4) improving the ability of algorithms to handle unexpected situations in the real world.
Authors: ZHONG Xinjian (钟欣见); WANG Yonghua (王永华); LI Ming (李明) (School of Automation, Guangdong University of Technology, Guangzhou, Guangdong 510006, China)
Source: Journal of Hebei University of Science and Technology (《河北科技大学学报》), 2024, No. 4, pp. 362-372 (11 pages). Indexed in: CAS; Peking University Core Journals (北大核心).
Funding: National Natural Science Foundation of China (61971147); Guangdong Basic and Applied Basic Research Foundation (2023A1515011888).
Keywords: artificial intelligence theory; general plugin; deep reinforcement learning; model design; intrinsic reward; experience replay; self-play
