
基于主动风险防御机制的多机器人强化学习协同对抗策略

Cooperative countermeasure strategy based on active risk defense multi-agent reinforcement learning
Abstract: Deep reinforcement learning (DRL) has become a research hotspot in multi-robot systems because of its efficient performance. However, in unstructured environments with continuously time-varying, unknown risks, traditional DRL methods show poor risk-defense capability and fragile system security. Unknown risks intrude nonlinearly on the state space of the multi-robot system in the form of adversarial attacks, seriously threatening the estimation of robot motion strategies. To address this problem, this paper proposes a multi-agent reinforcement learning method based on an active risk defense mechanism (ARD-MARL; the Chinese abstract abbreviates it APMARL). First, based on the partially observable Markov game model, a risk discrimination mechanism is established that shares a memory pool (global communication information) across robots and constructs a risk state index to predict in advance whether the current behavior is safe. Second, in the strategy deployment stage, an event-triggered multi-risk processing scheme applies the security strategy that matches each predicted risk level. Third, for dangerous states with risk intrusion, an active-defense Actor-Critic network architecture based on an enhanced attention mechanism is designed; by amplifying important information in a graded manner and suppressing threat information, it generates a safer and more efficient motion strategy. Finally, extensive experiments on multi-agent cooperative confrontation tasks show that the reinforcement learning strategy with the active risk defense mechanism effectively reduces the risk of intrusion by hostile information, improves the execution efficiency of cooperative confrontation tasks, and enhances the stability, risk resistance, and security of the strategy, including the security of information transmission.
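The abstract describes two mechanisms only at a high level: an event-triggered choice of risk-handling mode driven by a predicted risk level, and an attention step that strengthens important information while suppressing threatening information. The Python sketch below is one way to picture them, not the authors' implementation. The thresholds, the mode names, the `penalty` parameter, and both function names are hypothetical; the abstract gives no numeric values or network details.

```python
import math

# Hypothetical thresholds; the abstract does not give numeric values.
LOW_RISK, HIGH_RISK = 0.3, 0.7


def select_mode(risk_index: float) -> str:
    """Event-triggered choice of a handling mode from a risk index in [0, 1].

    The three mode names are illustrative assumptions.
    """
    if risk_index < LOW_RISK:
        return "normal"
    if risk_index < HIGH_RISK:
        return "cautious"
    return "active_defense"


def defended_attention(scores, threat_flags, penalty=5.0):
    """Softmax attention weights with flagged (risky) inputs suppressed.

    scores: raw relevance score for each incoming message.
    threat_flags: True where a message has been judged risky.
    penalty: hypothetical amount subtracted from a risky message's score.
    """
    adjusted = [s - penalty if risky else s
                for s, risky in zip(scores, threat_flags)]
    peak = max(adjusted)  # subtract the maximum for numerical stability
    exps = [math.exp(a - peak) for a in adjusted]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(select_mode(0.85))
    print(defended_attention([1.0, 1.0, 2.0], [False, True, False]))
```

In this sketch a flagged message still receives a small non-zero weight; a hard mask that sets its weight to zero would be an equally plausible reading of "effective defense" in the abstract.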
Authors: SUN Hui-hui (孙辉辉); HU Chun-he (胡春鹤); ZHANG Jun-guo (张军国). Affiliations: School of Technology, Beijing Forestry University, Beijing 100083, China; School of Mechanical and Electrical Engineering, North China Institute of Science and Technology, Langfang 065201, China; Key Lab of State Forestry and Grassland Administration for Forestry Equipment and Automation, Beijing 100083, China.
Source: Control and Decision (《控制与决策》), 2023, No. 5, pp. 1420-1429 (10 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
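Funding: National Natural Science Foundation of China (61703047); Science and Technology Research Project of Higher Education Institutions of Hebei Province (QN2021312).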
Keywords: deep reinforcement learning; multiple robots; risk defense; coordinated confrontation; event-triggered