6 articles found
1. ADAPTIVE CGF COMMANDER BEHAVIOR MODELING THROUGH HTN GUIDED MONTE CARLO TREE SEARCH (Cited by: 7)
Authors: Xiao Xu, Mei Yang, Ge Li. Journal of Systems Science and Systems Engineering (SCIE, EI, CSCD), 2018, Issue 2, pp. 231-249 (19 pages)
Improving the intelligence of virtual entities is an important issue in the construction of Computer Generated Forces (CGFs). Some traditional approaches try to achieve this by specifying how entities should react to predefined conditions, which is not suitable for complex and dynamic environments. This paper applies Monte Carlo Tree Search (MCTS) to the behavior modeling of a CGF commander. Through look-ahead reasoning, the model generates adaptive decisions to direct all troops in combat. Our main work is to formulate the tree model through state and action abstraction, and to extend its expansion process to handle simultaneous and durative moves. We also employ Hierarchical Task Network (HTN) planning to guide the search, thus enhancing search efficiency. The final implementation is tested in an infantry combat simulation in which a company commander controls three platoons to assault and clear enemies within defined areas. Comparative results from a series of experiments demonstrate that the HTN guided MCTS commander can outperform other commanders following fixed strategies.
Keywords: Monte Carlo tree search; Hierarchical Task Network; Computer generated force; Behavior modeling
2. Planning, monitoring and replanning techniques for handling abnormity in HTN-based planning and execution
Authors: KANG Kai, CHENG Kai, SHAO Tianhao, ZHANG Hongjun, ZHANG Ke. Journal of Systems Engineering and Electronics (SCIE, CSCD), 2024, Issue 5, pp. 1264-1275 (12 pages)
A framework that integrates planning, monitoring and replanning techniques is proposed. It can devise the best solution based on the current state according to specific objectives and properly deal with the influence of abnormity on plan execution. The framework consists of three parts: the hierarchical task network (HTN) planner based on Monte Carlo tree search (MCTS), hybrid plan monitoring based on forward and backward, and norm-based replanning method selection. The HTN planner based on MCTS selects the optimal method for an HTN compound task through pre-exploration. Based on specific objectives, it can identify the best solution to the current problem. The hybrid plan monitoring can detect the influence of abnormity on the effect of an executed action and on the premise of an unexecuted action, thus triggering replanning. The norm-based replanning selection method measures the difference between the expected state and the actual state, and then selects the best replanning algorithm. The experimental results reveal that our method can effectively deal with the influence of abnormity on the implementation of the plan and achieve the target task in an optimal way.
Keywords: hierarchical task network; Monte Carlo tree search (MCTS); planning; execution; abnormity
3. A Monte Carlo Neural Fictitious Self-Play approach to approximate Nash Equilibrium in imperfect-information dynamic games (Cited by: 5)
Authors: Li ZHANG, Yuxuan CHEN, Wei WANG, Ziliang HAN, Shijian Li, Zhijie PAN, Gang PAN. Frontiers of Computer Science (SCIE, EI, CSCD), 2021, Issue 5, pp. 137-150 (14 pages)
Solving the optimization problem to approach a Nash Equilibrium point plays an important role in imperfect-information games, e.g., StarCraft and poker. Neural Fictitious Self-Play (NFSP) is an effective algorithm that learns an approximate Nash Equilibrium of imperfect-information games purely from self-play, without prior domain knowledge. However, it needs to train a neural network in an off-policy manner to approximate the action values. For games with large search spaces, the training may suffer from unnecessary exploration and sometimes fails to converge. In this paper, we propose a new Neural Fictitious Self-Play algorithm that combines Monte Carlo tree search with NFSP, called MC-NFSP, to improve performance in real-time zero-sum imperfect-information games. With experiments and empirical analysis, we demonstrate that the proposed MC-NFSP algorithm can approximate Nash Equilibrium in games with large-scale search depth while NFSP cannot. Furthermore, we develop an Asynchronous Neural Fictitious Self-Play framework (ANFSP). It uses an asynchronous and parallel architecture to collect game experience and improve both training efficiency and policy quality. Experiments with games with hidden state information (Texas Hold'em) and first-person shooter (FPS) games demonstrate the effectiveness of our algorithms.
Keywords: approximate Nash Equilibrium; imperfect-information games; dynamic games; Monte Carlo tree search; Neural Fictitious Self-Play; reinforcement learning
4. An intelligent task offloading algorithm (iTOA) for UAV edge computing network (Cited by: 8)
Authors: Jienan Chen, Siyu Chen, Siyu Luo, Qi Wang, Bin Cao, Xiaoqian Li. Digital Communications and Networks (SCIE), 2020, Issue 4, pp. 433-443 (11 pages)
The Unmanned Aerial Vehicle (UAV) has emerged as a promising technology for supporting human activities such as target tracking, disaster rescue, and surveillance. However, these tasks require a large computation load of image or video processing, which imposes enormous pressure on the UAV computation platform. To solve this issue, in this work we propose an intelligent Task Offloading Algorithm (iTOA) for UAV edge computing networks. Compared with existing methods, iTOA is able to perceive the network's environment intelligently and decide the offloading action based on deep Monte Carlo Tree Search (MCTS), the core algorithm of AlphaGo. MCTS simulates the offloading decision trajectories to find the best decision by maximizing the reward, such as lowest latency or power consumption. To accelerate the search convergence of MCTS, we also propose a splitting Deep Neural Network (sDNN) to supply the prior probability for MCTS. The sDNN is trained by a self-supervised learning manager. Here, the training data set is obtained from iTOA itself as its own teacher. Compared with game theory and greedy search-based methods, the proposed iTOA improves service latency performance by 33% and 60%, respectively.
Keywords: Unmanned aerial vehicles (UAVs); Mobile edge computing (MEC); Intelligent task offloading algorithm (iTOA); Monte Carlo tree search (MCTS); Deep reinforcement learning; Splitting deep neural network (sDNN)
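Illustrative note: the abstract above says a network supplies prior probabilities to speed up MCTS. The listing does not give the selection rule the paper uses, so the following is only a minimal sketch of the widely used PUCT-style score, in which a prior weights the exploration bonus. The action names, the constant `c_puct`, and the statistics are hypothetical placeholders, not values from the paper.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Mean value plus an exploration bonus scaled by the prior probability."""
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)


def select_action(children, c_puct=1.5):
    """Pick the child with the highest PUCT score.

    `children` maps an action name to a dict with keys
    'q' (mean reward), 'prior' (network prior), and 'visits'.
    """
    parent_visits = sum(child["visits"] for child in children.values())
    return max(
        children,
        key=lambda a: puct_score(
            children[a]["q"],
            children[a]["prior"],
            parent_visits,
            children[a]["visits"],
            c_puct,
        ),
    )


# Hypothetical statistics for two offloading choices at one tree node.
children = {
    "run_locally": {"q": 0.40, "prior": 0.2, "visits": 10},
    "offload_to_edge": {"q": 0.35, "prior": 0.7, "visits": 2},
}
best = select_action(children)
print(best)
```

In this sketch a rarely visited action with a high prior receives a large bonus, which is one way a learned prior can steer the search toward promising branches early.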
5. A geospatial service composition approach based on MCTS with temporal-difference learning
Authors: Zhuang Can, Guo Mingqiang, Xie Zhong. High Technology Letters (EI, CAS), 2021, Issue 1, pp. 17-25 (9 pages)
With the complexity of the composition process and the rapid growth of candidate services, realizing optimal or near-optimal service composition is an urgent problem. Currently, the static service composition chain is rigid and cannot be easily adapted to the dynamic Web environment. To address these challenges, the geographic information service composition (GISC) problem is modeled as a sequential decision-making task. In addition, the Markov decision process (MDP), as a universal model for the planning problem of agents, is used to describe the GISC problem. Then, to achieve self-adaptivity and optimization in a dynamic environment, a novel approach that integrates Monte Carlo tree search (MCTS) and a temporal-difference (TD) learning algorithm is proposed. The concrete services of abstract services are determined with optimal policies and adaptive capability at runtime, based on the environment and the status of component services. A simulation experiment is performed to demonstrate the effectiveness and efficiency of the approach in terms of learning quality and performance.
Keywords: geospatial service composition; reinforcement learning (RL); Markov decision process (MDP); Monte Carlo tree search (MCTS); temporal-difference (TD) learning
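Illustrative note: the abstract above pairs MCTS with temporal-difference learning but does not say how the two are combined. The following is only a sketch of the textbook tabular TD(0) value update, not the authors' algorithm. The state names, `alpha`, and `gamma` are hypothetical.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update to `values` and return the TD error.

    Unseen states are treated as having value 0.0.
    """
    current = values.get(state, 0.0)
    td_error = reward + gamma * values.get(next_state, 0.0) - current
    values[state] = current + alpha * td_error
    return td_error


# Two hypothetical transitions along a service composition chain.
values = {}
td0_update(values, "s1", reward=1.0, next_state="terminal")
td0_update(values, "s0", reward=0.0, next_state="s1")
print(values)
```

The second update shows bootstrapping: the estimate for `s0` moves toward the discounted estimate of `s1` before any reward has been observed from `s0` itself.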
6. A version of Geiringer-like theorem for decision making in the environments with randomness and incomplete information
Authors: Boris Mitavskiy, Jonathan Rowe, Chris Cannings. International Journal of Intelligent Computing and Cybernetics (EI), 2012, Issue 1, pp. 36-90 (55 pages)
Purpose: The purpose of this paper is to establish a version of a theorem that originated in population genetics and was later adopted in evolutionary computation theory, and that will lead to novel Monte-Carlo sampling algorithms that provably increase the AI potential. Design/methodology/approach: In the current paper the authors set up a mathematical framework, then state and prove a version of a Geiringer-like theorem that is very well suited to the development of Monte-Carlo sampling algorithms that cope with randomness and incomplete information when making decisions. Findings: This work establishes an important theoretical link between classical population genetics, evolutionary computation theory and model-free reinforcement learning methodology. Not only may the theory explain the success of the currently existing Monte-Carlo tree sampling methodology, but it also leads to the development of novel Monte-Carlo sampling techniques guided by a rigorous mathematical foundation. Practical implications: The theoretical foundations established in the current work provide guidance for the design of powerful Monte-Carlo sampling algorithms in model-free reinforcement learning, to tackle numerous problems in computational intelligence. Originality/value: Establishing a Geiringer-like theorem with non-homologous recombination was a long-standing open problem in evolutionary computation theory. Apart from overcoming this challenge in a mathematically elegant fashion and establishing a rather general and powerful version of the theorem, this work leads directly to the development of novel, provably powerful algorithms for decision making in environments involving randomness and hidden or incomplete information.
Keywords: Decision making; Programming and algorithm theory; Monte Carlo methods; Markov processes; Reinforcement learning; Partially observable Markov decision processes; Monte Carlo tree search; Geiringer theorem; Evolutionary computation theory; Markov chains