
Tjong: A transformer-based Mahjong AI via hierarchical decision-making and fan backward

Abstract: Mahjong, a complex game with hidden information and sparse rewards, poses significant challenges. Existing Mahjong AIs require substantial hardware resources and extensive datasets to enhance AI capabilities. The authors propose a transformer-based Mahjong AI (Tjong) via hierarchical decision-making. By utilising self-attention mechanisms, Tjong effectively captures tile patterns and game dynamics, and it decouples the decision process into two distinct stages: action decision and tile decision. This design reduces decision complexity considerably. Additionally, a fan backward technique is proposed to address the sparse rewards by allocating reversed rewards for actions based on winning hands. Tjong consists of 15M parameters and is trained using approximately 0.5M data over 7 days of supervised learning on a single server with 2 GPUs. The action decision achieved an accuracy of 94.63%, while the claim decision attained 98.55% and the discard decision reached 81.51%. In a tournament format, Tjong outperformed other AIs (CNN, MLP, RNN, ResNet, ViT), achieving scores up to 230% higher than its opponents. Furthermore, after 3 days of reinforcement learning training, it ranked within the top 1% on the leaderboard on the Botzone platform.
Source: CAAI Transactions on Intelligence Technology (智能技术学报(英文)), SCIE, EI, 2024, Issue 4, pp. 982-995 (14 pages)
Funding: National Natural Science Foundation of China, Grant/Award Numbers: 62276285, 62236011; Major Project of National Social Sciences Foundation of China, Grant/Award Number: 20&ZD279.
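
The two-stage decision described in the abstract (first choose an action, then choose a tile for it) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the action labels, the tile codes, and the plain score dictionaries standing in for the model's outputs are all hypothetical, chosen only to show the control flow.

```python
from typing import Dict, List, Optional, Tuple


def argmax_legal(scores: Dict[str, float], legal: List[str]) -> str:
    """Return the legal option with the highest score (hypothetical helper)."""
    if not legal:
        raise ValueError("no legal options")
    # Options missing from the score table are treated as the worst choice.
    return max(legal, key=lambda k: scores.get(k, float("-inf")))


def hierarchical_decide(
    action_scores: Dict[str, float],
    tile_scores: Dict[str, Dict[str, float]],
    legal_tiles_by_action: Dict[str, List[str]],
) -> Tuple[str, Optional[str]]:
    """Stage 1 picks an action; stage 2 picks a tile for that action only.

    The score dictionaries are stand-ins for model outputs (an assumption);
    legal_tiles_by_action maps each legal action to its legal tiles.
    """
    # Stage 1: action decision over the legal actions.
    action = argmax_legal(action_scores, list(legal_tiles_by_action))

    # Stage 2: tile decision, restricted to tiles legal for the chosen action.
    tiles = legal_tiles_by_action[action]
    if not tiles:
        return action, None  # e.g. passing needs no tile
    return action, argmax_legal(tile_scores[action], tiles)


# Illustrative inputs (made-up scores and tile codes).
action_scores = {"pass": 0.1, "discard": 0.7, "chow": 0.2}
tile_scores = {"discard": {"W1": 0.2, "B5": 0.5, "T9": 0.3}}
legal = {"pass": [], "discard": ["W1", "T9"]}
print(hierarchical_decide(action_scores, tile_scores, legal))
```

In this sketch the second stage only ever scores tiles for the action already chosen, so each stage works over a small set of options instead of one flat set of every action-tile pair. That is one plausible reading of how the decoupling reduces decision complexity.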