2 articles found
1. Multi-Agent Reinforcement Learning for Police Patrol Path Planning Based on the Stackelberg Strategy (cited by: 4)
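Authors: Xie Yi (解易), Gu Yijun (顾益军). 《北京理工大学学报》 (Transactions of Beijing Institute of Technology). Indexed in: EI, CAS, CSCD, PKU Core (北大核心). 2017, No. 1, pp. 93-99 (7 pages).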
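Abstract: Existing patrol path planning algorithms can handle only two-player games and ignore the presence of attackers. To address these limitations, a new multi-agent reinforcement learning algorithm is proposed. Given a distribution of attack targets, it plans optimal patrol paths for an arbitrary number of defenders and attackers. Because defenders and attackers do not choose their strategies simultaneously, the strong Stackelberg equilibrium strategy is adopted as the basis on which each agent selects its strategy. The algorithm was tested on multiple patrol tasks. Quantitative and qualitative experimental results demonstrate its convergence and effectiveness.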
Keywords: patrol route planning; strong Stackelberg equilibrium strategy; multi-agent; reinforcement learning
2. Equalizer Zero-Determinant Strategy in Discounted Repeated Stackelberg Asymmetric Game (cited by: 1)
Authors: CHENG Zhaoyang, CHEN Guanpu, HONG Yiguang. Journal of Systems Science & Complexity. Indexed in: SCIE, EI, CSCD. 2024, No. 1, pp. 184-203 (20 pages).
Abstract: This paper focuses on the performance of equalizer zero-determinant (ZD) strategies in discounted repeated Stackelberg asymmetric games. In the leader-follower adversarial scenario, the strong Stackelberg equilibrium (SSE), which derives from the opponents' best response (BR), is technically the optimal strategy for the leader. However, computing an SSE strategy may be difficult, since it requires solving a mixed-integer program and has exponential complexity in the number of states. To this end, the authors propose an equalizer ZD strategy, which can unilaterally restrict the opponent's expected utility. The authors first study the existence of an equalizer ZD strategy in one-to-one situations and analyze an upper bound on its performance relative to the baseline SSE strategy. They then turn to multi-player models in which one player adopts an equalizer ZD strategy, give bounds on the weighted sum of the opponents' utilities, and compare it with the SSE strategy. Finally, the authors give simulations on unmanned aerial vehicles (UAVs) and moving target defense (MTD) to verify the effectiveness of the proposed approach.
Keywords: discounted repeated Stackelberg asymmetric game; equalizer zero-determinant strategy; strong Stackelberg equilibrium strategy
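Both abstracts rely on the strong Stackelberg equilibrium, in which the leader commits first, the follower best-responds, and ties among the follower's best responses are broken in the leader's favor. The following is a minimal sketch of that idea for a one-shot two-player matrix game, restricted to pure leader commitments. It is not the method of either paper: the function name `strong_stackelberg_pure` and the payoff matrices are hypothetical, and a full SSE computation would also search over mixed commitments.

```python
from typing import List, Tuple


def strong_stackelberg_pure(
    leader_payoff: List[List[float]],
    follower_payoff: List[List[float]],
) -> Tuple[int, int, float]:
    """Return (leader_action, follower_action, leader_value).

    Hypothetical helper: considers pure leader commitments only.
    Rows index leader actions, columns index follower actions.
    """
    best = None
    for i, (l_row, f_row) in enumerate(zip(leader_payoff, follower_payoff)):
        # Follower's best responses to the leader committing to action i.
        f_max = max(f_row)
        best_responses = [j for j, v in enumerate(f_row) if v == f_max]
        # "Strong" tie-breaking: among best responses, the follower picks
        # the one that is best for the leader.
        j = max(best_responses, key=lambda k: l_row[k])
        if best is None or l_row[j] > best[2]:
            best = (i, j, l_row[j])
    return best


# Illustrative payoffs (assumed, not taken from either paper).
leader = [[2, 4], [1, 3]]
follower = [[1, 0], [0, 1]]
result = strong_stackelberg_pure(leader, follower)
print(result)
```

In this illustrative game, row 0 strictly dominates row 1 for the leader, yet committing to row 1 steers the follower to column 1 and gives the leader a higher payoff than committing to row 0. This is the sense in which the order of moves, which the first abstract calls the non-simultaneity of strategy selection, changes the outcome.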