A 2v2 Sanguosha (三国杀) Multi-Agent Game Method Based on Reinforcement Learning

Abstract: Deep reinforcement learning has achieved great success in sequential decision-making and policy exploration. Much of this research draws inspiration from games, and its applications have expanded from single-agent to multi-agent scenarios. Card-based multiplayer strategy games are multi-agent systems, but existing studies are few and mostly concern Dou Dizhu and Texas Hold'em. To broaden research on card-based multi-agent strategy games, this paper proposes a reinforcement-learning-based multi-agent game method for Sanguosha (SGS-MAPG). A self-built 2v2 battle scenario set in the Sanguosha game serves as the experimental environment. The cooperating agents are modeled following the policy gradient approach, their decision process covers both teamwork and confrontation within the multi-agent system, and the method addresses the instability of environments with multiple agents. In computer-simulated matches, agents trained with this method show good learning and decision-making ability, can obtain a higher final team reward than the baseline algorithm, and achieve a win rate at least 12% higher.

Authors: LUO Fu-rong (骆芙蓉); WANG Yi-song (王以松); QIN Jin (秦进); YU Xiao-min (于小民) (College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China; Institute of Artificial Intelligence, Guizhou University, Guiyang, Guizhou 550025, China)

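Affiliations: College of Computer Science and Technology, Guizhou University; Institute of Artificial Intelligence, Guizhou University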

Source: Computer Simulation (《计算机仿真》), 2024, No. 7, pp. 484-490 (7 pages)

Funding: National Natural Science Foundation of China (U1836205).

Keywords: deep reinforcement learning; multi-agent; Sanguosha game environment; cooperation and competition

Classification: TP317 [Automation and Computer Technology: Computer Software and Theory]

