9 articles found
Autonomous air combat decision-making of UAV based on parallel self-play reinforcement learning (Cited by: 4)
1
Authors: Bo Li, Jingyi Huang, Shuangxia Bai, Zhigang Gan, Shiyang Liang, Neretin Evgeny, Shouwen Yao. 《CAAI Transactions on Intelligence Technology》 (SCIE, EI), 2023, No. 1, pp. 64-81 (18 pages)
To address the problem of manoeuvring decision-making in UAV air combat, this study establishes a one-to-one air combat model, defines missile attack areas, and uses the non-deterministic policy Soft Actor-Critic (SAC) algorithm in deep reinforcement learning to construct a decision model that realizes the manoeuvring process. At the same time, the complexity of the proposed algorithm is calculated, and the stability of the closed-loop air combat decision-making system controlled by a neural network is analysed using a Lyapunov function. This study defines the UAV air combat process as a gaming process and proposes a Parallel Self-Play training SAC algorithm (PSP-SAC) to improve the generalisation performance of UAV control decisions. Simulation results show that the proposed algorithm can realize sample sharing and policy sharing in multiple combat environments and can significantly improve the generalisation ability of the model compared to independent training.
Keywords: air combat decision; deep reinforcement learning; parallel self-play; SAC algorithm; UAV
Self-Play and Using an Expert to Learn to Play Backgammon with Temporal Difference Learning
2
Author: Marco A. Wiering. 《Journal of Intelligent Learning Systems and Applications》, 2010, No. 2, pp. 57-68 (12 pages)
A promising approach to learning to play board games is to use reinforcement learning algorithms that can learn a game position evaluation function. In this paper we examine and compare three different methods for generating training games: 1) learning by self-play, 2) learning by playing against an expert program, and 3) learning from viewing experts play against each other. Although the third possibility generates high-quality games from the start compared to the initial random games generated by self-play, the drawback is that the learning program is never allowed to test moves which it prefers. Since our expert program uses a similar evaluation function to the learning program, we also examine whether it is helpful to learn directly from the board evaluations given by the expert. We compared these methods using temporal difference methods with neural networks to learn the game of backgammon.
Keywords: board games; reinforcement learning; TD(λ); self-play; learning from demonstration
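The TD(λ) idea named in the abstract and keywords above can be illustrated with a minimal sketch. It assumes a linear evaluation function over hand-made feature vectors rather than the neural network used in the paper, and a single reward at the end of the game; the function name `td_lambda_episode` and all parameter values are hypothetical choices for illustration, not taken from the article.

```python
def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(ai * bi for ai, bi in zip(a, b))


def td_lambda_episode(weights, features, final_reward,
                      alpha=0.1, lam=0.7, gamma=1.0):
    """Apply TD(lambda) updates for one finished game.

    weights      -- current weights of a linear evaluation function (assumption)
    features     -- feature vectors of the successive positions in the game
    final_reward -- outcome observed after the last position (e.g. 1 = win)
    Returns the updated weight list; the input list is not modified.
    """
    w = list(weights)
    trace = [0.0] * len(w)  # eligibility trace, one entry per weight
    for t, x in enumerate(features):
        value = dot(w, x)
        if t + 1 < len(features):
            # Bootstrap from the evaluation of the next position.
            target = gamma * dot(w, features[t + 1])
        else:
            # The last position is scored by the actual game result.
            target = final_reward
        delta = target - value
        # For a linear evaluator the gradient w.r.t. the weights is x itself.
        trace = [gamma * lam * e + xi for e, xi in zip(trace, x)]
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]
    return w
```

With `lam=0` only the weights active in the final position are credited for the outcome; with `lam=1` earlier positions share the credit through the eligibility trace. The same update can be fed with games from self-play, from play against an expert, or from observed expert games, which is the comparison the abstract describes.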
Enhanced UAV Pursuit-Evasion Using Boids Modelling: A Synergistic Integration of Bird Swarm Intelligence and DRL
3
Authors: Weiqiang Jin, Xingwu Tian, Bohang Shi, Biao Zhao, Haibin Duan, Hao Wu. 《Computers, Materials & Continua》 (SCIE, EI), 2024, No. 9, pp. 3523-3553 (31 pages)
The UAV pursuit-evasion problem focuses on the efficient tracking and capture of evading targets using unmanned aerial vehicles (UAVs), which is pivotal in public safety applications, particularly in scenarios involving intrusion monitoring and interception. To address the challenges of data acquisition, real-world deployment, and the limited intelligence of existing algorithms in UAV pursuit-evasion tasks, we propose an innovative swarm intelligence-based UAV pursuit-evasion control framework, namely the "Boids Model-based DRL Approach for Pursuit and Escape" (Boids-PE), which synergizes the strengths of swarm intelligence from bio-inspired algorithms and deep reinforcement learning (DRL). The Boids model, which simulates collective behavior through three fundamental rules (separation, alignment, and cohesion), is adopted in our work. By integrating the Boids model with the Apollonian Circles algorithm, significant improvements are achieved in capturing UAVs against simple evasion strategies. To further enhance decision-making precision, we incorporate a DRL algorithm to facilitate more accurate strategic planning. We also leverage self-play training to continuously optimize the performance of pursuit UAVs. During experimental evaluation, we meticulously designed both one-on-one and multi-to-one pursuit-evasion scenarios, customizing the state space, action space, and reward function models for each scenario. Extensive simulations, supported by the PyBullet physics engine, validate the effectiveness of our proposed method. The overall results demonstrate that Boids-PE significantly enhances the efficiency and reliability of UAV pursuit-evasion tasks, providing a practical and robust solution for the real-world application of UAV pursuit-evasion missions.
Keywords: UAV pursuit-evasion; swarm intelligence algorithm; Boids model; deep reinforcement learning; self-play training
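The three Boids rules named in the abstract above (separation, alignment, cohesion) can be sketched in a few lines. This is a generic 2-D illustration only: the `Boid` class, the `boids_step` function, and every radius and weight below are hypothetical values chosen for the example, and the sketch does not include the Apollonian Circles or DRL components of Boids-PE.

```python
from dataclasses import dataclass


@dataclass
class Boid:
    x: float
    y: float
    vx: float
    vy: float


def boids_step(boids, radius=5.0, min_dist=1.0,
               w_sep=0.05, w_ali=0.05, w_coh=0.01, dt=1.0):
    """Return a new list of boids after one synchronous update step."""
    updated = []
    for b in boids:
        # Neighbours are all other boids within the perception radius.
        neighbours = [o for o in boids
                      if o is not b
                      and (o.x - b.x) ** 2 + (o.y - b.y) ** 2 <= radius ** 2]
        ax = ay = 0.0
        if neighbours:
            n = len(neighbours)
            # Cohesion: steer toward the neighbours' centre of mass.
            ax += w_coh * (sum(o.x for o in neighbours) / n - b.x)
            ay += w_coh * (sum(o.y for o in neighbours) / n - b.y)
            # Alignment: steer toward the neighbours' mean velocity.
            ax += w_ali * (sum(o.vx for o in neighbours) / n - b.vx)
            ay += w_ali * (sum(o.vy for o in neighbours) / n - b.vy)
            # Separation: push away from neighbours closer than min_dist.
            for o in neighbours:
                if (o.x - b.x) ** 2 + (o.y - b.y) ** 2 < min_dist ** 2:
                    ax += w_sep * (b.x - o.x)
                    ay += w_sep * (b.y - o.y)
        vx = b.vx + ax * dt
        vy = b.vy + ay * dt
        updated.append(Boid(b.x + vx * dt, b.y + vy * dt, vx, vy))
    return updated
```

Calling `boids_step` repeatedly on a list of `Boid` objects advances the flock one step at a time. All boids are updated from the same snapshot, so the result does not depend on list order.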
A Monte Carlo Neural Fictitious Self-Play approach to approximate Nash Equilibrium in imperfect-information dynamic games (Cited by: 5)
4
Authors: Li ZHANG, Yuxuan CHEN, Wei WANG, Ziliang HAN, Shijian Li, Zhijie PAN, Gang PAN. 《Frontiers of Computer Science》 (SCIE, EI, CSCD), 2021, No. 5, pp. 137-150 (14 pages)
Solving the optimization problem to approach a Nash Equilibrium point plays an important role in imperfect-information games, e.g., StarCraft and poker. Neural Fictitious Self-Play (NFSP) is an effective algorithm that learns an approximate Nash Equilibrium of imperfect-information games purely from self-play, without prior domain knowledge. However, it needs to train a neural network in an off-policy manner to approximate the action values. For games with large search spaces, the training may suffer from unnecessary exploration and sometimes fails to converge. In this paper, we propose a new Neural Fictitious Self-Play algorithm that combines Monte Carlo tree search with NFSP, called MC-NFSP, to improve performance in real-time zero-sum imperfect-information games. With experiments and empirical analysis, we demonstrate that the proposed MC-NFSP algorithm can approximate a Nash Equilibrium in games with large-scale search depth while NFSP cannot. Furthermore, we develop an Asynchronous Neural Fictitious Self-Play framework (ANFSP). It uses an asynchronous and parallel architecture to collect game experience and improve both training efficiency and policy quality. Experiments with games with hidden state information (Texas Hold'em) and FPS (first-person shooter) games demonstrate the effectiveness of our algorithms.
Keywords: approximate Nash Equilibrium; imperfect-information games; dynamic games; Monte Carlo tree search; Neural Fictitious Self-Play; reinforcement learning
Recess Confinement: Origins, Problems, and the Way Forward (Cited by: 3)
5
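Author: 李长伟. 《教育理论与实践》 (CSSCI, PKU Core), 2016, No. 10, pp. 3-6 (4 pages)
Recess confinement is increasingly becoming a widespread educational phenomenon. Two important causes lie behind it: first, the constraint placed on education by a spirit of the age centred on self-preservation; second, the one-child policy implemented in China. The phenomenon has now become an educational problem that demands serious attention. Recess confinement restricts students' naturally active disposition and harms their physical and mental health as well as their future ability to protect themselves. It also cuts children off from close contact with nature, so that they cannot acquire the intuitive safety knowledge and rich life experience that arise from interacting with nature. Moving beyond recess confinement fundamentally depends on establishing and implementing rules of play, which requires addressing three questions: what kind of rules these are, who makes them, and how they are to be implemented.
Keywords: recess confinement; self-preservation; naturally active disposition; rules of play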
A Study of the Factors Influencing University Students' Intention to Use Academic Databases (Cited by: 12)
6
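Author: 张培. 《图书情报知识》 (CSSCI, PKU Core), 2017, No. 5, pp. 108-119 (12 pages)
The continued use of academic databases is key to realizing their full value, yet relatively little research has examined the factors that influence university students' intention to use them. Based on the technology acceptance model and the information systems success model, and incorporating the three theories of self-efficacy, perceived enjoyment, and habit, this paper constructs a theoretical model of academic database use by university students. Data were collected by questionnaire survey and the research hypotheses were tested with structural equation modelling, revealing the influencing factors and mechanisms behind university students' use of academic databases. The study finds that the proposed model effectively reveals the factors influencing university students' intention to use academic databases, but the relationships between service quality and perceived ease of use, between service quality and perceived usefulness, and between perceived ease of use and intention to use were not confirmed. Based on these findings, corresponding recommendations are offered to database vendors.
Keywords: university students; academic databases; technology acceptance model; system success model; perceived enjoyment; self-efficacy; habit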
Exploring Children's Play Through Maslow's Theory of Peak Experience (Cited by: 8)
7
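Author: 李东林. 《四川教育学院学报》, 2007, No. B10, pp. 3-4, 7 (3 pages)
This paper links Maslow's theory of peak experience with children's experience of play, discusses children's play from the perspective of experience, and offers suggestions for guiding children's play by way of attaining peak experience through self-actualization.
Keywords: peak experience; playful experience; self-actualization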
Distributed Deep Reinforcement Learning: A Survey and a Multi-player Multi-agent Learning Toolbox
8
Authors: Qiyue Yin, Tongtong Yu, Shengqi Shen, Jun Yang, Meijing Zhao, Wancheng Ni, Kaiqi Huang, Bin Liang, Liang Wang. 《Machine Intelligence Research》 (EI, CSCD), 2024, No. 3, pp. 411-430 (20 pages)
With the breakthrough of AlphaGo, deep reinforcement learning has become a recognized technique for solving sequential decision-making problems. Despite its reputation, the data inefficiency caused by its trial-and-error learning mechanism makes deep reinforcement learning difficult to apply in a wide range of areas. Many methods have been developed for sample-efficient deep reinforcement learning, such as environment modelling, experience transfer, and distributed modifications, among which distributed deep reinforcement learning has shown its potential in various applications, such as human-computer gaming and intelligent transportation. In this paper, we summarize the state of this exciting field by comparing the classical distributed deep reinforcement learning methods and studying the components that are important for efficient distributed learning, covering the range from single-player single-agent distributed deep reinforcement learning to the most complex multi-player multi-agent distributed deep reinforcement learning. Furthermore, we review recently released toolboxes that help to realize distributed deep reinforcement learning without many modifications of their non-distributed versions. By analysing their strengths and weaknesses, a multi-player multi-agent distributed deep reinforcement learning toolbox is developed and released, which is further validated on Wargame, a complex environment, showing the usability of the proposed toolbox for multi-player multi-agent distributed deep reinforcement learning in complex games. Finally, we try to point out challenges and future trends, hoping that this brief review can provide a guide or a spark for researchers who are interested in distributed deep reinforcement learning.
Keywords: deep reinforcement learning; distributed machine learning; self-play; population-play; toolbox
AI in Human-computer Gaming: Techniques, Challenges and Opportunities (Cited by: 2)
9
Authors: Qi-Yue Yin, Jun Yang, Kai-Qi Huang, Mei-Jing Zhao, Wan-Cheng Ni, Bin Liang, Yan Huang, Shu Wu, Liang Wang. 《Machine Intelligence Research》 (EI, CSCD), 2023, No. 3, pp. 299-317 (19 pages)
With the breakthrough of AlphaGo, human-computer gaming AI has seen explosive growth, attracting more and more researchers all over the world. As human-computer gaming is a recognized standard for testing artificial intelligence, various human-computer gaming AI systems (AIs) have been developed, such as Libratus, OpenAI Five, and AlphaStar, which beat professional human players. The rapid development of human-computer gaming AIs indicates a big step for decision-making intelligence, and it seems that current techniques can handle very complex human-computer games. So, one natural question arises: what are the possible challenges of current techniques in human-computer gaming, and what are the future trends? To answer this question, in this paper we survey recent successful game AIs, covering board game AIs, card game AIs, first-person shooting game AIs, and real-time strategy game AIs. Through this survey, we 1) compare the main difficulties among different kinds of games and the corresponding techniques utilized for achieving professional human-level AIs; 2) summarize the mainstream frameworks and techniques that can be properly relied on for developing AIs for complex human-computer games; 3) raise the challenges or drawbacks of current techniques in the successful AIs; and 4) try to point out future trends in human-computer gaming AIs. Finally, we hope that this brief review can provide an introduction for beginners and inspire insight for researchers in the field of AI in human-computer gaming.
Keywords: human-computer gaming; AI; intelligent decision making; deep reinforcement learning; self-play