Abstract
A reinforcement learning method based on Monte Carlo tree search and deep neural networks is proposed for training a Gobang model with strong playing strength. Starting from a given board state, the model conducts self-play using Monte Carlo tree search, uses a policy-value network to evaluate the prior probability and final value of each legal move, and selects the best move. Experimental results indicate that the model generalizes well, and a Gobang program built on it won first prize in the 2022 China University Computer Game Competition and China Computer Game Championship.
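The abstract does not say how prior probabilities and values are combined during search. A common choice in policy-value Monte Carlo tree search is the PUCT rule, sketched below as an illustration only. The names `Node`, `expand`, `select_move`, `backup`, and `most_visited_move`, and the constant `c_puct = 5.0`, are assumptions made for this sketch and are not taken from the paper; the network itself is replaced by a plain dictionary of priors.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Search-tree node (hypothetical structure, not the authors' code).

    `prior` is the policy network's probability for the move leading here.
    Value statistics are stored from the viewpoint of the player who made
    that move.
    """
    prior: float
    visit_count: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)  # move -> Node

    def mean_value(self) -> float:
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def expand(node: Node, priors: dict) -> None:
    """Create one child per legal move, using the network's prior for each."""
    for move, p in priors.items():
        node.children[move] = Node(prior=p)


def puct_score(parent: Node, child: Node, c_puct: float) -> float:
    """Mean value plus an exploration bonus that favours high-prior, rarely visited moves."""
    bonus = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.mean_value() + bonus


def select_move(parent: Node, c_puct: float = 5.0):
    """Pick the child move with the highest PUCT score (c_puct is an assumed constant)."""
    return max(parent.children,
               key=lambda m: puct_score(parent, parent.children[m], c_puct))


def backup(path: list, leaf_value: float) -> None:
    """Propagate a leaf evaluation up the path from root to leaf.

    `leaf_value` is the value for the player to move at the leaf, so it is
    negated for the leaf node (stored for the player who moved into it) and
    the sign flips at every step up the tree.
    """
    value = -leaf_value
    for node in reversed(path):
        node.visit_count += 1
        node.value_sum += value
        value = -value


def most_visited_move(root: Node):
    """After the search, play the move whose child was visited most often."""
    return max(root.children, key=lambda m: root.children[m].visit_count)
```

In a full self-play loop, each simulation would call `select_move` repeatedly until it reaches a leaf, query the network for priors and a value at that leaf, then call `expand` and `backup`. The move actually played is usually the most visited one rather than the one with the highest mean value, since visit counts are less noisy.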
Authors
刘克
曹杨
金张根
孔维立
Liu Ke; Cao Yang; Jin Zhanggen; Kong Weili (School of Information and Control Engineering, Liaoning Shihua University, Fushun 113001, China; School of Artificial Intelligence and Software, Liaoning Shihua University, Fushun 113001, China)
Source
Modern Computer (《现代计算机》)
2023, No. 19, pp. 102-105 (4 pages)
Funding
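Liaoning Provincial College Students' Innovation and Entrepreneurship Training Program (S202210148038)
Scientific Research Project of the Education Department of Liaoning Province (LJKMZ20220754)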
Keywords
Gobang
game playing
convolutional neural network
reinforcement learning