
A Deep Reinforcement Learning Algorithm for Tank Speed Control (cited by: 1)
Abstract: Unmanned operation is expected to become one of the future research directions for combat equipment such as tanks. For unmanned tank driving, slow agent training is a major bottleneck in current deep reinforcement learning. To address this, an exploration strategy based on replaying recent experience is proposed to improve the traditional soft actor-critic (SAC) algorithm: during training, recent experiences are given larger weights, which raises their sampling probability and thereby improves training stability and convergence speed. On this basis, a reward function is designed around the application environment and the combat task to improve the battlefield applicability of the algorithm. A specific combat scenario is constructed to compare the improved algorithm with the traditional one, and the results show that the proposed algorithm performs better in tank speed control.
Authors: CUI Xin-yue (崔新悦); YANG Zhou-ming (阳周明); ZHAO Yan-dong (赵彦东); YANG Xiao (杨霄); FAN Ling-yu (范玲瑜). North Automatic Control Technology Institute, Taiyuan 030006, China
Source: Fire Control & Command Control (《火力与指挥控制》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 4, pp. 120-125 (6 pages)
Keywords: deep reinforcement learning; soft actor-critic; tank speed control; sampling strategy
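
The abstract describes giving recent experiences larger weights so that they are sampled more often during training, but it does not state the weighting rule. The sketch below is one possible reading, not the authors' implementation: it assumes a geometric weight `decay ** age`, where `age` is the number of transitions stored since a given one. The class name `RecencyWeightedReplayBuffer` and the `decay` parameter are hypothetical and introduced only for illustration.

```python
import numpy as np


class RecencyWeightedReplayBuffer:
    """Ring buffer whose sampling probability grows with transition recency.

    Assumption: weight = decay ** age. The source does not specify the rule.
    """

    def __init__(self, capacity, decay=0.999, seed=0):
        self.capacity = capacity
        self.decay = decay        # hypothetical hyperparameter, 0 < decay <= 1
        self.storage = []         # stored transitions
        self.insert_step = []     # global step at which each slot was written
        self.step = 0             # total number of transitions added so far
        self.rng = np.random.default_rng(seed)

    def add(self, transition):
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
            self.insert_step.append(self.step)
        else:
            # Buffer is full: overwrite the oldest slot.
            slot = self.step % self.capacity
            self.storage[slot] = transition
            self.insert_step[slot] = self.step
        self.step += 1

    def probabilities(self):
        # Age 0 is the most recent transition; older ones get smaller weights.
        age = (self.step - 1) - np.asarray(self.insert_step)
        weights = self.decay ** age
        return weights / weights.sum()

    def sample(self, batch_size):
        # Sample with replacement according to the recency-based probabilities.
        idx = self.rng.choice(len(self.storage), size=batch_size,
                              p=self.probabilities())
        return [self.storage[i] for i in idx]
```

With `decay=1.0` every weight equals 1 and the buffer falls back to uniform sampling, which is the usual choice in standard SAC implementations; smaller values of `decay` concentrate sampling on the newest transitions.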
