Abstract
How an unmanned squad should make behavioral decisions to better accomplish its tasks is a current research focus in unmanned driving. Building on the soft actor-critic (SAC) algorithm, this paper proposes a recent dual experience replay SAC model. The model makes two changes: (1) it replaces random sampling with recent-experience sampling, and (2) it replaces the single experience pool with dual experience pools. Experimental results show that, compared with the conventional SAC algorithm, the improved SAC algorithm learns more efficiently and more stably, reduces the policy network error, and gives the unmanned squad a higher task success rate.
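The two changes named in the abstract can be illustrated with a minimal sketch. The abstract does not say how transitions are split between the two pools, how "recent" is defined, or how a batch is divided, so everything below is an assumption made for illustration and not the authors' implementation: the class name RecentDualReplayBuffer, the caller-supplied to_primary flag, the recent_window size, and the primary_fraction split are all hypothetical.

```python
import random
from collections import deque


class RecentDualReplayBuffer:
    """Hypothetical sketch: two experience pools, each sampled only
    from its most recent entries instead of uniformly at random."""

    def __init__(self, capacity=10000, recent_window=1000,
                 primary_fraction=0.5, seed=None):
        # Two bounded pools; the oldest transitions are evicted first.
        self.primary = deque(maxlen=capacity)
        self.secondary = deque(maxlen=capacity)
        self.recent_window = recent_window        # assumed definition of "recent"
        self.primary_fraction = primary_fraction  # assumed share of each batch
        self.rng = random.Random(seed)

    def add(self, transition, to_primary):
        # Assumption: the caller decides which pool a transition belongs to
        # (for example, by a reward threshold). The abstract does not say.
        pool = self.primary if to_primary else self.secondary
        pool.append(transition)

    def _sample_recent(self, pool, k):
        if k <= 0 or not pool:
            return []
        start = max(0, len(pool) - self.recent_window)
        recent = list(pool)[start:]
        # Sample with replacement from the recent window only.
        return [self.rng.choice(recent) for _ in range(k)]

    def sample(self, batch_size):
        n_primary = round(batch_size * self.primary_fraction)
        # Fall back to a single pool while the other one is still empty.
        if not self.primary:
            n_primary = 0
        elif not self.secondary:
            n_primary = batch_size
        batch = (self._sample_recent(self.primary, n_primary)
                 + self._sample_recent(self.secondary, batch_size - n_primary))
        self.rng.shuffle(batch)
        return batch
```

In a SAC training loop, a buffer like this would stand in for the usual uniform replay buffer: transitions are stored with add() after each environment step, and sample() supplies the minibatch for the critic and policy updates.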
Authors
李海川
阳周明
王洋
崔新悦
王娜
LI Haichuan; YANG Zhouming; WANG Yang; CUI Xinyue; WANG Na (North Automatic Control Technology Institute, Taiyuan 030006, China)
Source
《火力与指挥控制》
CSCD
Peking University Core Journals (北大核心)
2023, No. 6, pp. 70-75, 83 (7 pages)
Fire Control & Command Control
Keywords
deep reinforcement learning
soft actor-critic (SAC) algorithm
recent dual experience pool replay
unmanned squad behavior decision-making