Abstract
Mahjong is a typical imperfect-information game. Most current approaches to mahjong rely on deep reinforcement learning and have achieved very good results. However, such mahjong AIs depend on high-quality datasets, and popular mahjong (大众麻将) lacks the large, effectively labeled datasets they require, so how to discard quickly during play becomes the main problem. To address this, the paper studies the discard action and, taking heuristic fast discarding as its starting point, proposes a Monte Carlo evaluation method oriented toward the opponents' winning tiles. The heuristic fast discard method and the Monte Carlo evaluation method are combined to compute a value for each tile in hand, and the valuation score determines which tile is discarded each turn. A demarcation point, set from empirical knowledge as a number of discards already made, divides the game into an early and a late decision period: the heuristic fast discard method is used in the early period and the Monte Carlo evaluation method in the late period. This two-period, layered decision process selects the best discard while effectively reducing both the decision time and the deal-in rate (点炮率, the rate of discarding a tile on which an opponent wins). A program using the proposed method won first prize in the Chinese Computer Game Tournament, which the authors present as evidence of the method's effectiveness.
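The two-period decision flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the heuristic, the sampling scheme, or the demarcation point, so everything below is an assumption made for illustration rather than the authors' implementation: the tile encoding, the helper names `heuristic_value`, `completes`, `monte_carlo_danger`, and `choose_discard`, the default threshold of 8 discards, and the simplification of an opponent's wait to a randomly sampled two-tile partial hand.

```python
import random
from collections import Counter

# Assumed encoding: "3m" = rank 3 in suit m; honor tiles are omitted for brevity.
SUITS = "mps"
ALL_TILES = [f"{r}{s}" for s in SUITS for r in range(1, 10)] * 4


def heuristic_value(tile, hand):
    """Hypothetical heuristic: a tile is worth more if it pairs or connects."""
    rank, suit = int(tile[0]), tile[1]
    counts = Counter(hand)
    value = 2 * (counts[tile] - 1)  # reward pairs and triplets
    for d in (-2, -1, 1, 2):
        if counts[f"{rank + d}{suit}"]:
            value += 2 if abs(d) == 1 else 1  # adjacent tiles count more
    return value


def completes(tile, a, b):
    """True if `tile` turns the partial hand (a, b) into a triplet or sequence."""
    if a == b:
        return tile == a
    if tile[1] != a[1] or a[1] != b[1]:
        return False
    ranks = sorted([int(tile[0]), int(a[0]), int(b[0])])
    return ranks[1] - ranks[0] == 1 and ranks[2] - ranks[1] == 1


def monte_carlo_danger(tile, unseen, rng, samples=200):
    """Fraction of sampled opponent waits (assumed model) that `tile` completes."""
    hits = 0
    for _ in range(samples):
        a, b = rng.sample(unseen, 2)  # one hypothetical opponent partial hand
        hits += completes(tile, a, b)
    return hits / samples


def choose_discard(hand, discards_so_far, unseen, threshold=8, rng=None):
    """Early period: heuristic only. Late period: Monte Carlo danger first."""
    rng = rng or random.Random(0)
    if discards_so_far < threshold:
        return min(hand, key=lambda t: heuristic_value(t, hand))
    return min(
        hand,
        key=lambda t: (monte_carlo_danger(t, unseen, rng), heuristic_value(t, hand)),
    )
```

For example, `choose_discard(["1m", "2m", "3m", "5p", "5p", "9s"], 0, ALL_TILES)` falls in the early period and drops the isolated tile, while a call with `discards_so_far` at or above the threshold ranks tiles by sampled danger and uses the heuristic value only to break ties.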
Authors
张小川
严明珠
涂飞
陈俊宇
魏乐天
ZHANG Xiaochuan; YAN Mingzhu; TU Fei; CHEN Junyu; WEI Letian (School of Liangjiang Artificial Intelligence, Chongqing University of Technology, Chongqing 401120, China)
Source
《重庆理工大学学报(自然科学)》
CAS
PKU Core Journals (北大核心)
2024, No. 5, pp. 102-107 (6 pages)
Journal of Chongqing University of Technology:Natural Science
Funding
National Natural Science Foundation of China (60443004)
Chongqing Technology Innovation and Application Development Special Project (cstc2021jscx-dxwtBX0019)
Keywords
computer game
imperfect information game
mahjong
heuristic fast discard
Monte Carlo evaluation method