19 articles found
1. Planning, monitoring and replanning techniques for handling abnormity in HTN-based planning and execution
Authors: KANG Kai, CHENG Kai, SHAO Tianhao, ZHANG Hongjun, ZHANG Ke. Journal of Systems Engineering and Electronics (SCIE, CSCD), 2024, No. 5, pp. 1264-1275, 12 pages.
A framework that integrates planning, monitoring and replanning techniques is proposed. It can devise the best solution based on the current state according to specific objectives and properly deal with the influence of abnormity on plan execution. The framework consists of three parts: a hierarchical task network (HTN) planner based on Monte Carlo tree search (MCTS), hybrid forward and backward plan monitoring, and norm-based replanning method selection. The HTN planner based on MCTS selects the optimal method for each HTN compound task through pre-exploration. Based on specific objectives, it can identify the best solution to the current problem. The hybrid plan monitoring can detect the influence of abnormity on the effect of an executed action and on the premise of an unexecuted action, and thus trigger replanning. The norm-based replanning selection method measures the difference between the expected state and the actual state, and then selects the best replanning algorithm. The experimental results reveal that our method can effectively deal with the influence of abnormity on the implementation of the plan and achieve the target task in an optimal way.
Keywords: hierarchical task network; Monte Carlo tree search (MCTS); planning; execution; abnormity
2. Research on the adaptive hybrid search tree anti-collision algorithm in RFID system (cited by 3)
Authors: Jin Xiaofang (靳晓芳), Liu Mengxuan, Shao Min, Jin Libiao, Huang Xianglin. High Technology Letters (EI, CAS), 2016, No. 1, pp. 107-112, 6 pages.
Because more tag collisions result in more failed transmissions, tag anti-collision is a vital issue in radio frequency identification (RFID) systems. However, decreases in communication time and increases in throughput have so far been very limited. To solve these problems, this paper presents a novel tag anti-collision scheme, namely the adaptive hybrid search tree (AHST), which combines two algorithms: adaptive binary-tree disassembly (ABD) and the combination query tree (CQT). ABD has superior tag identification velocity, and CQT has optimum performance in system throughput and search timeslots. Theoretical analysis and numerical simulations show that the proposed algorithm combines the advantages of the above algorithms, improves the system throughput and dramatically reduces the number of search timeslots.
Keywords: anti-collision; adaptive binary-tree disassembly (ABD); hybrid search tree; discrimination
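The abstract above builds on tree-based anti-collision schemes. As background only, here is a minimal sketch of the basic query tree protocol, not the AHST, ABD or CQT algorithms themselves. It assumes tag IDs are distinct bit strings of equal length; the function name `query_tree_identify` is hypothetical.

```python
def query_tree_identify(tag_ids):
    """Identify tags with the basic query tree protocol.

    Assumes tag_ids are distinct bit strings of equal length.
    Returns (identified_ids, number_of_reader_queries).
    """
    pending = [""]          # stack of prefixes the reader still has to query
    identified = []
    queries = 0
    while pending:
        prefix = pending.pop()
        queries += 1
        # Every tag whose ID starts with the queried prefix responds.
        responders = [t for t in tag_ids if t.startswith(prefix)]
        if len(responders) == 1:
            identified.append(responders[0])   # readable slot
        elif len(responders) > 1:
            # Collision: split the prefix and query both branches later.
            pending.append(prefix + "1")
            pending.append(prefix + "0")
        # No responders: idle slot, nothing to do.
    return identified, queries
```

Counting reader queries in this way is one simple proxy for the "search timeslots" that tree-based schemes try to reduce.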
3. Blocking optimized SIMD tree search on modern processors (cited by 2)
Authors: Zhang Zhuo (张倬), Lu Yufan (陆宇凡), Shen Wenfeng (沈文枫), Xu Weimin (徐炜民), Zheng Yanheng (郑衍衡). Journal of Shanghai University (English Edition) (CAS), 2011, No. 5, pp. 437-444, 8 pages.
Tree search is a widely used fundamental algorithm. Modern processors provide tremendous computing power by integrating multiple cores, each with a vector processing unit. This paper reviews some studies on exploiting the single instruction multiple data (SIMD) capacity of processors to improve the performance of tree search, and proposes several improvements to reported SIMD tree search algorithms. Based on a blocking tree structure, blocking for memory alignment and dynamic blocking prefetch are proposed to reduce the overhead of memory access. Furthermore, as a form of non-linear loop unrolling, search branch unwinding shows that the number of branches can exceed the data width of SIMD instructions in the SIMD search algorithm. The experiments suggest that the blocking optimized SIMD tree search algorithm can achieve a response speed 1.6 times faster than the un-optimized algorithm.
Keywords: single instruction multiple data (SIMD); tree search; binary search; streaming SIMD extensions (SSE); Cell broadband engine (BE)
4. A Physical Layer Network Coding Based Tag Anti-Collision Algorithm for RFID System (cited by 3)
Authors: Cuixiang Wang, Xing Shao, Yifan Meng, Jun Gao. Computers, Materials & Continua (SCIE, EI), 2021, No. 1, pp. 931-945, 15 pages.
In an RFID (Radio Frequency IDentification) system, when multiple tags are in the operating range of one reader and send their information to the reader simultaneously, the signals of these tags are superimposed in the air, which results in a collision and degrades tag identification efficiency. To improve the identification efficiency of multiple tags in the presence of collisions, a physical layer network coding based binary search tree algorithm (PNBA) is proposed in this paper. PNBA pushes the conflicting signal information of multiple tags, which traditional anti-collision algorithms discard, onto a stack. In addition, PNBA exploits physical layer network coding to obtain unread tag information by decoding the conflicting information in the stack. Therefore, PNBA reduces the number of interactions between reader and tags, and improves tag identification efficiency. Theoretical analysis and simulation results using MATLAB demonstrate that PNBA reduces the number of readings and improves RFID identification efficiency. In particular, when the number of tags to be identified is 100, the average number of readings needed by PNBA is 83% lower than that of the basic binary search tree algorithm and 43% lower than that of the reverse binary search tree algorithm, and its reading efficiency reaches 0.93.
Keywords: radio frequency identification (RFID); tag anti-collision algorithm; physical layer network coding; binary search tree algorithm
5. An intelligent task offloading algorithm (iTOA) for UAV edge computing network (cited by 8)
Authors: Jienan Chen, Siyu Chen, Siyu Luo, Qi Wang, Bin Cao, Xiaoqian Li. Digital Communications and Networks (SCIE), 2020, No. 4, pp. 433-443, 11 pages.
The Unmanned Aerial Vehicle (UAV) has emerged as a promising technology for the support of human activities, such as target tracking, disaster rescue, and surveillance. However, these tasks require a large computation load of image or video processing, which imposes enormous pressure on the UAV computation platform. To solve this issue, in this work, we propose an intelligent Task Offloading Algorithm (iTOA) for UAV edge computing networks. Compared with existing methods, iTOA is able to perceive the network's environment intelligently to decide the offloading action based on deep Monte Carlo Tree Search (MCTS), the core algorithm of AlphaGo. MCTS simulates the offloading decision trajectories to acquire the best decision by maximizing the reward, such as lowest latency or power consumption. To accelerate the search convergence of MCTS, we also propose a splitting Deep Neural Network (sDNN) to supply the prior probability for MCTS. The sDNN is trained by a self-supervised learning manager. Here, the training data set is obtained from iTOA itself as its own teacher. Compared with game theory and greedy search-based methods, the proposed iTOA improves service latency performance by 33% and 60%, respectively.
Keywords: unmanned aerial vehicles (UAVs); mobile edge computing (MEC); intelligent task offloading algorithm (iTOA); Monte Carlo tree search (MCTS); deep reinforcement learning; splitting deep neural network (sDNN)
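The abstract above describes a neural network supplying prior probabilities to MCTS. A minimal sketch of how a prior can bias child selection is given below, using an AlphaGo-style PUCT score. This is an assumption for illustration, not the scoring rule of iTOA; the function name `puct_select`, the dictionary layout and the constant `c_puct` are all hypothetical.

```python
import math

def puct_select(children, c_puct=1.5):
    """Return the index of the child with the highest PUCT-style score.

    children: list of dicts with keys 'prior' (probability from a network),
    'visits' (visit count) and 'value_sum' (sum of simulation rewards).
    """
    total_visits = sum(ch["visits"] for ch in children)

    def score(ch):
        # Exploitation term: mean reward observed so far (0 if unvisited).
        q = ch["value_sum"] / ch["visits"] if ch["visits"] else 0.0
        # Exploration term: large for high-prior, rarely visited children.
        # sqrt(total + 1) is a variant that keeps the prior active at the root.
        u = c_puct * ch["prior"] * math.sqrt(total_visits + 1) / (1 + ch["visits"])
        return q + u

    return max(range(len(children)), key=lambda i: score(children[i]))
```

With no visits recorded, the choice follows the prior alone; as visits accumulate, the mean reward dominates.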
6. A geospatial service composition approach based on MCTS with temporal-difference learning
Authors: Zhuang Can, Guo Mingqiang, Xie Zhong. High Technology Letters (EI, CAS), 2021, No. 1, pp. 17-25, 9 pages.
With the complexity of the composition process and the rapid growth of candidate services, realizing optimal or near-optimal service composition is an urgent problem. Currently, the static service composition chain is rigid and cannot be easily adapted to the dynamic Web environment. To address these challenges, the geographic information service composition (GISC) problem is modeled as a sequential decision-making task. In addition, the Markov decision process (MDP), as a universal model for the planning problem of agents, is used to describe the GISC problem. Then, to achieve self-adaptivity and optimization in a dynamic environment, a novel approach that integrates Monte Carlo tree search (MCTS) and a temporal-difference (TD) learning algorithm is proposed. The concrete services of abstract services are determined with optimal policies and adaptive capability at runtime, based on the environment and the status of component services. A simulation experiment is performed to demonstrate the effectiveness and efficiency of the approach in terms of learning quality and performance.
Keywords: geospatial service composition; reinforcement learning (RL); Markov decision process (MDP); Monte Carlo tree search (MCTS); temporal-difference (TD) learning
7. Two-Dimensional Rectangular Stock Cutting Problem and Solution Methods
Authors: Zhao Hui, Yu Liang, Ning Tao, Xi Ping (School of Mechanical Engineering and Automation, Beijing University of Aeronautics and Astronautics, Beijing 100083, China; Manufacturing and Production). Computer Aided Drafting, Design and Manufacturing, 2001, No. 2, pp. 1-7, 7 pages.
Optimal layout of rectangular stock cutting is still in great demand from industry for diversified applications. This paper introduces four basic solution methods to the problem: linear programming, dynamic programming, tree search and the heuristic approach. A prototype of application software is developed to verify the pros and cons of the various approaches.
Keywords: rectangular stock cutting; linear programming; dynamic programming; tree search; heuristic
8. TibetanGoTinyNet: a lightweight U-Net style network for zero learning of Tibetan Go
Authors: Xiali LI, Yanyin ZHANG, Licheng WU, Yandong CHEN, Junzhi YU. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2024, No. 7, pp. 924-937, 14 pages.
The game of Tibetan Go faces a scarcity of expert knowledge and research literature. Therefore, we study the zero learning model of Tibetan Go under limited computing power resources and propose a novel scale-invariant U-Net style two-headed output lightweight network, TibetanGoTinyNet. Lightweight convolutional neural networks and a capsule structure are applied to the encoder and decoder of TibetanGoTinyNet to reduce the computational burden and achieve better feature extraction results. Several autonomous self-attention mechanisms are integrated into TibetanGoTinyNet to capture the Tibetan Go board's spatial and global information and select important channels. The training data are generated entirely from self-play games. TibetanGoTinyNet achieves a 62%–78% winning rate against four other U-Net style models: Res-UNet, Res-UNet Attention, Ghost-UNet, and Ghost Capsule-UNet. It also achieves a 75% winning rate in the ablation experiments on the attention mechanism with embedded positional information. The model saves about 33% of the training time with a 45%–50% winning rate for different Monte-Carlo tree search (MCTS) simulation counts when migrated from 9×9 to 11×11 boards. Code for our model is available at https://github.com/paulzyy/TibetanGoTinyNet.
Keywords: zero learning; Tibetan Go; U-Net; self-attention mechanism; capsule network; Monte-Carlo tree search
9. ADAPTIVE CGF COMMANDER BEHAVIOR MODELING THROUGH HTN GUIDED MONTE CARLO TREE SEARCH (cited by 7)
Authors: Xiao Xu, Mei Yang, Ge Li. Journal of Systems Science and Systems Engineering (SCIE, EI, CSCD), 2018, No. 2, pp. 231-249, 19 pages.
Improving the intelligence of virtual entities is an important issue in Computer Generated Forces (CGFs) construction. Some traditional approaches try to achieve this by specifying how entities should react to predefined conditions, which is not suitable for complex and dynamic environments. This paper aims to apply Monte Carlo Tree Search (MCTS) to the behavior modeling of a CGF commander. By look-ahead reasoning, the model generates adaptive decisions to direct all the troops in combat. Our main work is to formulate the tree model through state and action abstraction, and to extend its expansion process to handle simultaneous and durative moves. We also employ Hierarchical Task Network (HTN) planning to guide the search, thus enhancing the search efficiency. The final implementation is tested in an infantry combat simulation where a company commander needs to control three platoons to assault and clear enemies within defined areas. Comparative results from a series of experiments demonstrate that the HTN guided MCTS commander can outperform other commanders following fixed strategies.
Keywords: Monte Carlo tree search; hierarchical task network; computer generated force; behavior modeling
10. Limiting theorems for the nodes in binary search trees (cited by 1)
Authors: LIU Jie, SU Chun, CHEN Yu. Science China Mathematics (SCIE), 2008, No. 1, pp. 101-114, 14 pages.
We consider three random variables X_n, Y_n and Z_n, which represent the numbers of nodes with 0, 1, and 2 children in binary search trees of size n. The expectations and variances of these three random variables are obtained, and it is also shown, by applying the contraction method, that X_n, Y_n and Z_n are all asymptotically normal as n→∞.
Keywords: binary search tree; nodes; law of large numbers; contraction method; limiting distribution; 60F05; 05C80
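To make the quantities X_n, Y_n and Z_n in the abstract above concrete, the following is a small simulation sketch. It assumes the usual random permutation model (keys 0..n-1 inserted in uniformly random order), which the abstract does not state explicitly; the function names `child_counts` and `simulate` are hypothetical.

```python
import random

def child_counts(keys):
    """Insert distinct keys into a BST in the given order and return
    (X, Y, Z): the numbers of nodes with 0, 1 and 2 children."""
    left, right = {}, {}        # key -> left child / right child
    root = None
    for k in keys:
        if root is None:
            root = k
            continue
        node = root
        while True:
            branch = left if k < node else right
            if node in branch:
                node = branch[node]      # descend
            else:
                branch[node] = k         # attach the new leaf
                break
    counts = [0, 0, 0]
    for k in keys:
        counts[(k in left) + (k in right)] += 1
    return tuple(counts)

def simulate(n, trials, seed=0):
    """Average (X_n, Y_n, Z_n) over random insertion orders."""
    rng = random.Random(seed)
    totals = [0, 0, 0]
    for _ in range(trials):
        keys = list(range(n))
        rng.shuffle(keys)
        for i, c in enumerate(child_counts(keys)):
            totals[i] += c
    return [t / trials for t in totals]
```

In any nonempty binary tree the three counts sum to n and the number of leaves exceeds the number of two-child nodes by exactly one, which gives a quick sanity check on the simulation.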
11. Concurrent Manipulation of Expanded AVL Trees
Authors: Zhang Yin (章寅), Xu Zhuoqun (许卓群). Journal of Computer Science & Technology (SCIE, EI, CSCD), 1998, No. 4, pp. 325-336, 12 pages.
The concurrent manipulation of an expanded AVL tree (EAVL tree) is considered in this paper. The presented system can support any number of concurrent processes that perform searching, insertion and deletion on the tree. Simulation results indicate the high performance of the system. Elaborate techniques are used to achieve such a system, which is unavailable based on any known algorithms. Methods developed in this paper may provide new insights into other problems in the area of concurrent search structure manipulation.
Keywords: AVL tree; data structure; binary search tree; concurrent algorithm; concurrency control; locking protocol
12. Fast Tree Search for A Triangular Lattice Model of Protein Folding
Authors: Xiaomei Li, Nengchao Wang. Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2004, No. 4, pp. 245-252, 8 pages.
Using a triangular lattice model to study the designability of protein folding, we overcame the parity problem of the previous cubic lattice model and enumerated all the sequences and compact structures on a simple two-dimensional triangular lattice model of size 4+5+6+5+4. We used two types of amino acids, hydrophobic and polar, to make up the sequences, and obtained 2^23 + 2^12 different sequences, excluding the reverse symmetry sequences. The total number of distinct compact structures was 219,093, excluding reflection symmetry in the self-avoiding path of length 24 on the triangular lattice model. Based on this model, we applied a fast search algorithm by constructing a cluster tree. The algorithm decreased the computation by computing the objective energy of non-leaf nodes. The parallel experiments showed that the fast tree search algorithm yielded an exponential speed-up in the model of size 4+5+6+5+4. Designability analysis was performed to understand the search result.
Keywords: triangular lattice model; protein folding; fast search tree; designability
13. VLSI implementation of MIMO detection for 802.11n using a novel adaptive tree search algorithm
Authors: Yao Heng (尧横), Jian Haifang (鉴海防), Zhou Liguo (周立国), Shi Yin (石寅). Journal of Semiconductors (EI, CAS, CSCD), 2013, No. 10, pp. 107-113, 7 pages.
A 4×4 64-QAM multiple-input multiple-output (MIMO) detector is presented for application in an IEEE 802.11n wireless local area network. The detector implements a novel adaptive tree search (ATS) algorithm, and multiple ATS cores need to be instantiated to meet the wideband requirement of the 802.11n standard. Both the ATS algorithm and the architectural considerations are explained. The latency of the detector is 0.75 μs, and the detector has a gate count of 848 k with a total of 19 parallel ATS cores. Each ATS core runs at 67 MHz. Measurement results show that, compared with the floating-point ATS algorithm, the fixed-point implementation incurs a loss of 0.9 dB at a BER of 10^-3.
Keywords: multiple-input multiple-output; adaptive tree search; sphere decoder; fixed complexity sphere decoder; 802.11n
14. A Monte Carlo Neural Fictitious Self-Play approach to approximate Nash Equilibrium in imperfect-information dynamic games (cited by 5)
Authors: Li ZHANG, Yuxuan CHEN, Wei WANG, Ziliang HAN, Shijian Li, Zhijie PAN, Gang PAN. Frontiers of Computer Science (SCIE, EI, CSCD), 2021, No. 5, pp. 137-150, 14 pages.
Solving the optimization problem to approach a Nash Equilibrium point plays an important role in imperfect-information games, e.g., StarCraft and poker. Neural Fictitious Self-Play (NFSP) is an effective algorithm that learns an approximate Nash Equilibrium of imperfect-information games purely from self-play, without prior domain knowledge. However, it needs to train a neural network in an off-policy manner to approximate the action values. For games with large search spaces, the training may suffer from unnecessary exploration and sometimes fails to converge. In this paper, we propose a new Neural Fictitious Self-Play algorithm that combines Monte Carlo tree search with NFSP, called MC-NFSP, to improve the performance in real-time zero-sum imperfect-information games. With experiments and empirical analysis, we demonstrate that the proposed MC-NFSP algorithm can approximate a Nash Equilibrium in games with large-scale search depth, while NFSP cannot. Furthermore, we develop an Asynchronous Neural Fictitious Self-Play framework (ANFSP). It uses an asynchronous and parallel architecture to collect game experience and improve both the training efficiency and policy quality. Experiments with games with hidden state information (Texas Hold'em) and FPS (first-person shooter) games demonstrate the effectiveness of our algorithms.
Keywords: approximate Nash Equilibrium; imperfect-information games; dynamic games; Monte Carlo tree search; Neural Fictitious Self-Play; reinforcement learning
15. Rich-text document styling restoration via reinforcement learning (cited by 1)
Authors: Hongwei LI, Yingpeng HU, Yixuan CAO, Ganbin ZHOU, Ping LUO. Frontiers of Computer Science (SCIE, EI, CSCD), 2021, No. 4, pp. 93-103, 11 pages.
Richly formatted documents, such as financial disclosures, scientific articles and government regulations, widely exist on the Web. However, since most of these documents are only for public reading, the styling information inside them is usually missing, making them improper or even burdensome to display and edit in different formats and platforms. In this study we formulate the task of document styling restoration as an optimization problem, which aims to identify the styling settings of the document elements, e.g., lines, table cells and text, so that rendering with the output styling settings results in a document in which each element holds (closely) the same position as the one in the original document. Considering that each styling setting is a decision, this problem can be transformed into a multi-step decision-making task over all the document elements, and then be solved by reinforcement learning. Specifically, Monte-Carlo Tree Search (MCTS) is leveraged to explore the different styling settings, and the policy function is learnt under the supervision of the delayed rewards. As a case study, we restore the styling information inside tables, where structural and functional data in the documents are usually presented. Experiments show that our best reinforcement method successfully restores the stylings in 87.65% of the tables, a 25.75% absolute improvement over the greedy method. We also discuss the tradeoff between inference time and restoration success rate, and argue that although the reinforcement methods cannot be used in real-time scenarios, they are suitable for offline tasks with high-quality requirements. Finally, this model has been applied in a PDF parser to support cross-format display.
Keywords: styling restoration; Monte-Carlo tree search; reinforcement learning; richly formatted documents; tables
16. Multicommodity Flow Modeling for the Data Transmission Scheduling Problem in Navigation Satellite Systems (cited by 1)
Authors: Jungang Yan, Lining Xing, Chao Li, Zhongshan Zhang. Complex System Modeling and Simulation, 2021, No. 3, pp. 232-241, 10 pages.
Introducing InterSatellite Links (ISLs) is a major trend in new-generation Global Navigation Satellite Systems (GNSSs). Data transmission scheduling is a crucial problem in the study of ISL management. The existing research on intersatellite data transmission has not considered the capacities of ISL bandwidth. Thus, the current study is the first to describe the intersatellite data transmission scheduling problem with capacity restrictions in GNSSs. A model conversion strategy is designed to model the aforementioned problem as a length-bounded single-path multicommodity flow problem. An integer programming model is constructed to minimize the maximal sum of flows on each intersatellite edge; this minimization is equivalent to minimizing the maximal occupied ISL bandwidth. An iterated tree search algorithm is proposed to solve the problem, and two ranking rules are designed to guide the search. Experiments based on the BeiDou satellite constellation are designed, and the results demonstrate the effectiveness of the proposed model and algorithm.
Keywords: intersatellite link; navigation satellite system; data transmission; multicommodity flow; tree search
17. Mechanical Assembly Packing Problem Using Joint Constraints
Authors: Ming-Liang Xu, Ning-Bo Gu, Wei-Wei Xu, Ming-Yuan Li, Jun-Xiao Xue, Bing Zhou. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2017, No. 6, pp. 1162-1171, 10 pages.
The three-dimensional packing problem generally concerns how to pack a set of models into a given bounding box using the smallest packaging volume. It is known to be an NP-hard problem. In the mechanical field, the space utilization of a mechanism is low due to the constraint of mechanical joints between different mechanical parts. Although this situation can be improved by breaking the mechanism into components at every joint, doing so burdens the user when reassembling the mechanism and may also reduce the service life of the mechanical parts. In this paper, we propose a novel mechanism packing algorithm that deliberately considers the DOFs (degrees of freedom) of mechanical joints. With this algorithm, we construct the solution space according to each joint. While building the search tree of the splitting scheme, we do not break the joint, but move it. Therefore, the proposed algorithm requires only the minimal number of splits to meet the goal of space utilization. Numerical examples show that the proposed method is convenient and efficient for packing three-dimensional models into a given bounding box with high space utilization.
Keywords: NP-hard problem; packing problem; search tree
18. A version of Geiringer-like theorem for decision making in the environments with randomness and incomplete information
Authors: Boris Mitavskiy, Jonathan Rowe, Chris Cannings. International Journal of Intelligent Computing and Cybernetics (EI), 2012, No. 1, pp. 36-90, 55 pages.
Purpose: The purpose of this paper is to establish a version of a theorem that originated in population genetics and was later adopted in evolutionary computation theory, and that will lead to novel Monte-Carlo sampling algorithms that provably increase the AI potential. Design/methodology/approach: In the current paper the authors set up a mathematical framework, and state and prove a version of a Geiringer-like theorem that is very well suited to the development of Monte-Carlo sampling algorithms that cope with randomness and incomplete information to make decisions. Findings: This work establishes an important theoretical link between classical population genetics, evolutionary computation theory and model-free reinforcement learning methodology. Not only may the theory explain the success of the currently existing Monte-Carlo tree sampling methodology, but it also leads to the development of novel Monte-Carlo sampling techniques guided by a rigorous mathematical foundation. Practical implications: The theoretical foundations established in the current work provide guidance for the design of powerful Monte-Carlo sampling algorithms in model-free reinforcement learning, to tackle numerous problems in computational intelligence. Originality/value: Establishing a Geiringer-like theorem with non-homologous recombination was a long-standing open problem in evolutionary computation theory. Apart from overcoming this challenge in a mathematically elegant fashion and establishing a rather general and powerful version of the theorem, this work leads directly to the development of novel, provably powerful algorithms for decision making in environments involving randomness and hidden or incomplete information.
Keywords: decision making; programming and algorithm theory; Monte Carlo methods; Markov processes; reinforcement learning; partially observable Markov decision processes; Monte Carlo tree search; Geiringer theorem; evolutionary computation theory; Markov chains
19. Reinforcement learning and A* search for the unit commitment problem
Authors: Patrick de Mars, Aidan O'Sullivan. Energy and AI, 2022, No. 3, pp. 172-181, 10 pages.
Previous research has combined model-free reinforcement learning with model-based tree search methods to solve the unit commitment problem with stochastic demand and renewables generation. This approach was limited to shallow search depths and suffered from significant variability in run time across problem instances with varying complexity. To mitigate these issues, we extend this methodology to more advanced search algorithms based on A* search. First, we develop a problem-specific heuristic based on priority list unit commitment methods and apply this in Guided A* search, reducing run time by up to 94% with negligible impact on operating costs. In addition, we address the run time variability issue by employing a novel anytime algorithm, Guided IDA*, replacing the fixed search depth parameter with a time budget constraint. We show that Guided IDA* mitigates the run time variability of previous guided tree search algorithms and enables further operating cost reductions of up to 1%.
Keywords: unit commitment; reinforcement learning; tree search; power systems