Journal Articles
5 articles found
1. Autonomous air combat decision-making of UAV based on parallel self-play reinforcement learning (Cited by: 1)
Authors: Bo Li, Jingyi Huang, Shuangxia Bai, Zhigang Gan, Shiyang Liang, Neretin Evgeny, Shouwen Yao. CAAI Transactions on Intelligence Technology (SCIE, EI), 2023, Issue 1, pp. 64-81 (18 pages).
Aiming at addressing the problem of manoeuvring decision-making in UAV air combat, this study establishes a one-to-one air combat model, defines missile attack areas, and uses the non-deterministic policy Soft-Actor-Critic (SAC) algorithm in deep reinforcement learning to construct a decision model to realize the manoeuvring process. At the same time, the complexity of the proposed algorithm is calculated, and the stability of the closed-loop system of air combat decision-making controlled by the neural network is analysed with a Lyapunov function. This study defines the UAV air combat process as a gaming process and proposes a Parallel Self-Play training SAC algorithm (PSP-SAC) to improve the generalisation performance of UAV control decisions. Simulation results have shown that the proposed algorithm can realize sample sharing and policy sharing across multiple combat environments and can significantly improve the generalisation ability of the model compared to independent training.
Keywords: air combat decision, deep reinforcement learning, parallel self-play, SAC algorithm, UAV
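The structural idea behind PSP-SAC, as the abstract describes it, is several parallel combat environments feeding one shared replay buffer while both duelling agents share a single policy. The sketch below is a toy illustration of that loop only, not the paper's implementation: `ToyCombatEnv`, its reward rule, and the random action choice (standing in for SAC's stochastic actor and gradient updates) are all assumptions.

```python
import random
from collections import deque


class ToyCombatEnv:
    """Stand-in for a one-to-one air-combat simulation (illustrative only)."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return (0.0, 1.0)  # toy state: (relative angle, normalized distance)

    def step(self, action):
        self.t += 1
        next_state = (self.rng.uniform(-1, 1), self.rng.uniform(0, 1))
        reward = 1.0 if next_state[1] < 0.2 else 0.0  # inside the "attack area"
        done = self.t >= 10
        return next_state, reward, done


class SharedReplayBuffer:
    """Sample sharing: transitions from every parallel environment land here."""
    def __init__(self, capacity=10_000):
        self.data = deque(maxlen=capacity)

    def push(self, transition):
        self.data.append(transition)

    def sample(self, k):
        return random.sample(list(self.data), min(k, len(self.data)))


def run_parallel_self_play(n_envs=4, rounds=3):
    envs = [ToyCombatEnv(seed=i) for i in range(n_envs)]
    buffer = SharedReplayBuffer()
    policy_version = 0  # policy sharing: one policy, updated after each round
    for _ in range(rounds):
        for env in envs:  # both sides of every duel act with the shared policy
            state, done = env.reset(), False
            while not done:
                # placeholder for sampling from SAC's stochastic actor
                action = random.choice(["left", "right", "climb"])
                next_state, reward, done = env.step(action)
                buffer.push((state, action, reward, next_state, done))
                state = next_state
        batch = buffer.sample(64)  # a real SAC learner would take gradient steps here
        policy_version += 1
    return buffer, policy_version
```

In a real system the inner loop over environments would run as parallel workers; running it sequentially keeps the sketch self-contained while preserving the shared-buffer, shared-policy structure.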
2. Self-Play and Using an Expert to Learn to Play Backgammon with Temporal Difference Learning
Authors: Marco A. Wiering. Journal of Intelligent Learning Systems and Applications, 2010, Issue 2, pp. 57-68 (12 pages).
A promising approach to learn to play board games is to use reinforcement learning algorithms that can learn a game position evaluation function. In this paper we examine and compare three different methods for generating training games: 1) learning by self-play, 2) learning by playing against an expert program, and 3) learning from viewing experts play against each other. Although the third possibility generates high-quality games from the start, compared to the initially random games generated by self-play, the drawback is that the learning program is never allowed to test the moves it prefers. Since our expert program uses a similar evaluation function to the learning program, we also examine whether it is helpful to learn directly from the board evaluations given by the expert. We compared these methods using temporal difference methods with neural networks to learn the game of backgammon.
Keywords: board games, reinforcement learning, TD(λ), self-play, learning from demonstration
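All three ways of generating training games feed the same TD(λ) update; only who selects the moves differs between self-play, playing against the expert, and watching experts play. A minimal tabular sketch of that update with accumulating eligibility traces follows; the paper trains a neural network evaluation function, so the value table and integer states here are simplifying assumptions.

```python
def td_lambda_episode(values, traces, states, rewards,
                      alpha=0.1, gamma=1.0, lam=0.8):
    """One TD(lambda) pass over a finished game.

    values: dict mapping each state in `states` to its current evaluation.
    traces: dict of eligibility traces (usually starts empty each game).
    states: the positions visited, in order; rewards[t] is received on the
    transition from states[t] to states[t + 1].
    """
    for t in range(len(states) - 1):
        s, s_next = states[t], states[t + 1]
        # TD error between successive position evaluations
        delta = rewards[t] + gamma * values[s_next] - values[s]
        # accumulating eligibility trace for the visited state
        traces[s] = traces.get(s, 0.0) + 1.0
        # credit every recently visited state, decayed by gamma * lambda
        for k in list(traces):
            values[k] += alpha * delta * traces[k]
            traces[k] *= gamma * lam
    return values
```

On a three-state game with a single terminal reward of 1, one pass moves the value of the final non-terminal state by alpha (0.1) and the earlier state by alpha times the decayed trace (0.08), which is the eligibility-trace credit assignment the method relies on.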
3. Distributed Deep Reinforcement Learning: A Survey and a Multi-player Multi-agent Learning Toolbox
Authors: Qiyue Yin, Tongtong Yu, Shengqi Shen, Jun Yang, Meijing Zhao, Wancheng Ni, Kaiqi Huang, Bin Liang, Liang Wang. Machine Intelligence Research (EI, CSCD), 2024, Issue 3, pp. 411-430 (20 pages).
With the breakthrough of AlphaGo, deep reinforcement learning has become a recognized technique for solving sequential decision-making problems. Despite its reputation, the data inefficiency caused by its trial-and-error learning mechanism makes deep reinforcement learning difficult to apply in a wide range of areas. Many methods have been developed for sample-efficient deep reinforcement learning, such as environment modelling, experience transfer, and distributed modifications, among which distributed deep reinforcement learning has shown its potential in various applications, such as human-computer gaming and intelligent transportation. In this paper, we summarize the state of this exciting field by comparing the classical distributed deep reinforcement learning methods and studying the components needed to achieve efficient distributed learning, covering everything from single-player single-agent to the most complex multiple-player multiple-agent distributed deep reinforcement learning. Furthermore, we review recently released toolboxes that help realize distributed deep reinforcement learning without many modifications to their non-distributed versions. By analysing their strengths and weaknesses, a multi-player multi-agent distributed deep reinforcement learning toolbox is developed and released, and is further validated on Wargame, a complex environment, showing the usability of the proposed toolbox for multi-player multi-agent distributed deep reinforcement learning in complex games. Finally, we point out challenges and future trends, hoping that this brief review can provide a guide or a spark for researchers interested in distributed deep reinforcement learning.
Keywords: deep reinforcement learning, distributed machine learning, self-play, population-play, toolbox
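Most distributed DRL systems of the kind this survey compares separate data collection (actors) from optimization (a learner holding the parameters). The sketch below shows that actor-learner split synchronously so it stays self-contained; `ParameterServer`, the placeholder gradient, and the rollout shape are illustrative assumptions, and a real toolbox would run the actors as separate processes or machines.

```python
import random
from dataclasses import dataclass


@dataclass
class ParameterServer:
    """Learner-side weights: actors pull snapshots, the learner pushes updates."""
    version: int = 0
    weights: float = 0.0

    def pull(self):
        return self.version, self.weights

    def push(self, grad, lr=0.1):
        self.weights -= lr * grad
        self.version += 1


def actor_rollout(weights, env_seed, horizon=5):
    """One actor collects a short trajectory using its local weight snapshot."""
    rng = random.Random(env_seed)
    # each sample records (observation, weights used to act) - a toy trajectory
    return [(rng.uniform(-1, 1), weights) for _ in range(horizon)]


def distributed_step(server, n_actors=4):
    trajectories = []
    for i in range(n_actors):  # in a real system these run concurrently
        _, snapshot = server.pull()
        trajectories.extend(actor_rollout(snapshot, env_seed=i))
    # placeholder "gradient": a real learner backpropagates through a network
    grad = sum(obs for obs, _ in trajectories) / len(trajectories)
    server.push(grad)
    return server
```

The version counter makes the usual staleness question visible: actors that pulled before a `push` act with an older snapshot, which is exactly the synchronous-versus-asynchronous trade-off distributed DRL designs differ on.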
4. AI in Human-computer Gaming: Techniques, Challenges and Opportunities
Authors: Qi-Yue Yin, Jun Yang, Kai-Qi Huang, Mei-Jing Zhao, Wan-Cheng Ni, Bin Liang, Yan Huang, Shu Wu, Liang Wang. Machine Intelligence Research (EI, CSCD), 2023, Issue 3, pp. 299-317 (19 pages).
With the breakthrough of AlphaGo, human-computer gaming AI has seen explosive growth, attracting more and more researchers all over the world. As a recognized standard for testing artificial intelligence, various human-computer gaming AI systems (AIs) have been developed, such as Libratus, OpenAI Five, and AlphaStar, which beat professional human players. The rapid development of human-computer gaming AIs indicates a big step for decision-making intelligence, and it seems that current techniques can handle very complex human-computer games. So, one natural question arises: what are the possible challenges of current techniques in human-computer gaming, and what are the future trends? To answer this question, we survey recent successful game AIs, covering board game AIs, card game AIs, first-person shooting game AIs, and real-time strategy game AIs. Through this survey, we 1) compare the main difficulties among different kinds of games and the corresponding techniques utilized for achieving professional human-level AIs; 2) summarize the mainstream frameworks and techniques that can be relied on for developing AIs for complex human-computer games; 3) raise the challenges or drawbacks of current techniques in the successful AIs; and 4) point out future trends in human-computer gaming AIs. Finally, we hope that this brief review can provide an introduction for beginners and inspire insight for researchers in the field of AI in human-computer gaming.
Keywords: human-computer gaming AI, intelligent decision making, deep reinforcement learning, self-play
5. A Monte Carlo Neural Fictitious Self-Play approach to approximate Nash Equilibrium in imperfect-information dynamic games (Cited by: 4)
Authors: Li ZHANG, Yuxuan CHEN, Wei WANG, Ziliang HAN, Shijian Li, Zhijie PAN, Gang PAN. Frontiers of Computer Science (SCIE, EI, CSCD), 2021, Issue 5, pp. 137-150 (14 pages).
Solving the optimization problem to approach a Nash Equilibrium point plays an important role in imperfect-information games, e.g., StarCraft and poker. Neural Fictitious Self-Play (NFSP) is an effective algorithm that learns an approximate Nash Equilibrium of imperfect-information games from pure self-play without prior domain knowledge. However, it needs to train a neural network in an off-policy manner to approximate the action values. For games with large search spaces, the training may suffer from unnecessary exploration and sometimes fails to converge. In this paper, we propose a new Neural Fictitious Self-Play algorithm that combines Monte Carlo tree search with NFSP, called MC-NFSP, to improve the performance in real-time zero-sum imperfect-information games. With experiments and empirical analysis, we demonstrate that the proposed MC-NFSP algorithm can approximate Nash Equilibrium in games with large-scale search depth while NFSP cannot. Furthermore, we develop an Asynchronous Neural Fictitious Self-Play framework (ANFSP). It uses an asynchronous and parallel architecture to collect game experience and improve both the training efficiency and policy quality. Experiments with games with hidden state information (Texas Hold'em) and first-person shooter (FPS) games demonstrate the effectiveness of our algorithms.
Keywords: approximate Nash Equilibrium, imperfect-information games, dynamic games, Monte Carlo tree search, Neural Fictitious Self-Play, reinforcement learning
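The action-selection rule at the heart of NFSP can be sketched compactly: with probability eta the agent acts with its best-response (RL) policy and logs that action into the supervised buffer that trains the average policy; otherwise it acts with the average policy. The function below is an illustrative sketch, not the paper's code; the policy callables are placeholders, and MC-NFSP would additionally drive the best response with Monte Carlo tree search.

```python
import random


def nfsp_action(state, rl_policy, avg_policy, sl_buffer, eta=0.1, rng=random):
    """One NFSP action choice.

    rl_policy: best-response policy (in NFSP, an off-policy Q-learner).
    avg_policy: average policy (a supervised network imitating sl_buffer).
    sl_buffer: only best-response actions are stored, so the average policy
    tracks the historical mixture of best responses - the fictitious-play
    mechanism that drives the pair toward an approximate Nash Equilibrium.
    """
    if rng.random() < eta:
        action = rl_policy(state)
        sl_buffer.append((state, action))
        return action
    return avg_policy(state)
```

Over many decisions the fraction of best-response actions concentrates around eta, and the supervised buffer receives exactly those actions, which is what lets the average policy approximate the best-response mixture.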