A framework that integrates planning, monitoring, and replanning techniques is proposed. It can devise the best solution for the current state according to specific objectives and properly handle the influence of anomalies on plan execution. The framework consists of three parts: a hierarchical task network (HTN) planner based on Monte Carlo tree search (MCTS), hybrid forward and backward plan monitoring, and norm-based selection of the replanning method. The MCTS-based HTN planner selects the optimal method for each HTN compound task through pre-exploration, so it can identify the best solution to the current problem with respect to the specified objectives. The hybrid plan monitoring detects the influence of anomalies on the effects of executed actions and on the preconditions of unexecuted actions, and triggers replanning when necessary. The norm-based replanning selection method measures the difference between the expected state and the actual state and then selects the best replanning algorithm. The experimental results show that the method effectively handles the influence of anomalies on plan execution and achieves the target task in an optimal way.
Because tag collisions cause failed transmissions, tag anti-collision is a vital issue in radio frequency identification (RFID) systems. So far, however, reductions in communication time and gains in throughput have been very limited. To address these problems, this paper presents a novel tag anti-collision scheme, the adaptive hybrid search tree (AHST), which combines two algorithms: adaptive binary-tree disassembly (ABD), which offers superior tag identification speed, and the combination query tree (CQT), which performs best in system throughput and search timeslots. Theoretical analysis and numerical simulations show that the proposed algorithm combines the advantages of both algorithms, improving system throughput and dramatically reducing the number of search timeslots.
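To make the notions of collision, idle, and successful timeslots concrete, the following is a minimal sketch of a plain query-tree reader. It is not AHST, ABD, or CQT; the function name `query_tree_identify` and the assumption that tag IDs are unique bit strings of equal length are illustrative only.

```python
from collections import deque


def query_tree_identify(tag_ids):
    """Simulate a basic query-tree reader.

    Assumes tag IDs are unique bit strings of equal length (hypothetical
    setup). Returns the identified IDs and a count of each slot type.
    """
    tags = sorted(set(tag_ids))
    pending = deque([""])  # prefixes the reader still has to query
    identified = []
    slots = {"idle": 0, "success": 0, "collision": 0}
    while pending:
        prefix = pending.popleft()
        # Every tag whose ID starts with the queried prefix answers.
        responders = [t for t in tags if t.startswith(prefix)]
        if not responders:
            slots["idle"] += 1
        elif len(responders) == 1:
            slots["success"] += 1
            identified.append(responders[0])
        else:
            # Collision: split the group by extending the prefix by one bit.
            slots["collision"] += 1
            pending.append(prefix + "0")
            pending.append(prefix + "1")
    return identified, slots


ids, slots = query_tree_identify(["000", "001", "010", "111"])
throughput = slots["success"] / sum(slots.values())  # identified tags per slot
```

In this sketch, throughput is the fraction of timeslots that identify a tag, so anti-collision schemes of the kind described above aim to shrink the collision and idle counts.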
Tree search is a widely used fundamental algorithm. Modern processors provide tremendous computing power by integrating multiple cores, each with a vector processing unit. This paper reviews studies on exploiting the single instruction multiple data (SIMD) capability of processors to improve tree search performance, and proposes several improvements to previously reported SIMD tree search algorithms. Based on a blocked tree structure, blocking for memory alignment and dynamic block prefetching are proposed to reduce memory access overhead. Furthermore, as a form of non-linear loop unrolling, search branch unwinding shows that the number of branches can exceed the data width of SIMD instructions in the SIMD search algorithm. The experiments suggest that the blocking-optimized SIMD tree search algorithm responds 1.6 times faster than the unoptimized algorithm.
In a Radio Frequency IDentification (RFID) system, when multiple tags are within the operating range of one reader and send their information to the reader simultaneously, their signals are superimposed in the air. This causes a collision and degrades tag identification efficiency. To improve the efficiency of identifying multiple tags under collisions, this paper proposes a physical layer network coding based binary search tree algorithm (PNBA). PNBA pushes the conflicting signal information of multiple tags, which traditional anti-collision algorithms discard, onto a stack. PNBA then exploits physical layer network coding to obtain unread tag information by decoding the conflicting information in the stack. PNBA therefore reduces the number of interactions between the reader and the tags and improves tag identification efficiency. Theoretical analysis and MATLAB simulation results demonstrate that PNBA reduces the number of readings and improves RFID identification efficiency. In particular, when 100 tags are to be identified, the average number of readings required by PNBA is 83% lower than that of the basic binary search tree algorithm and 43% lower than that of the reverse binary search tree algorithm, and its reading efficiency reaches 0.93.
The Unmanned Aerial Vehicle (UAV) has emerged as a promising technology for supporting human activities such as target tracking, disaster rescue, and surveillance. However, these tasks require heavy image or video processing, which puts enormous pressure on the UAV computation platform. To solve this issue, we propose an intelligent Task Offloading Algorithm (iTOA) for UAV edge computing networks. Compared with existing methods, iTOA perceives the network environment intelligently and decides the offloading action using deep Monte Carlo Tree Search (MCTS), the core algorithm of AlphaGo. MCTS simulates offloading decision trajectories to find the best decision by maximizing the reward, such as the lowest latency or power consumption. To accelerate the search convergence of MCTS, we also propose a splitting Deep Neural Network (sDNN) that supplies the prior probability for MCTS. The sDNN is trained by a self-supervised learning manager, and the training data set is obtained from iTOA itself, which acts as its own teacher. Compared with game theory and greedy search-based methods, the proposed iTOA improves service latency performance by 33% and 60%, respectively.
Given the complexity of the composition process and the rapid growth in candidate services, achieving optimal or near-optimal service composition is an urgent problem. Currently, the static service composition chain is rigid and cannot easily adapt to the dynamic Web environment. To address these challenges, the geographic information service composition (GISC) problem is modeled as a sequential decision-making task. In addition, the Markov decision process (MDP), a universal model for agent planning problems, is used to describe the GISC problem. Then, to achieve self-adaptivity and optimization in a dynamic environment, a novel approach that integrates Monte Carlo tree search (MCTS) and a temporal-difference (TD) learning algorithm is proposed. The concrete services for each abstract service are determined at runtime with optimal policies and adaptive capability, based on the environment and the status of the component services. A simulation experiment demonstrates the effectiveness and efficiency of the approach in terms of learning quality and performance.
Optimal layout for rectangular stock cutting is still in great demand in industry for diverse applications. This paper introduces four basic solution methods for the problem: linear programming, dynamic programming, tree search, and heuristic approaches. A prototype application is developed to verify the pros and cons of the various approaches.
The game of Tibetan Go suffers from a scarcity of expert knowledge and research literature. We therefore study a zero-learning model of Tibetan Go under limited computing resources and propose TibetanGoTinyNet, a novel scale-invariant, U-Net style, two-headed lightweight network. Lightweight convolutional neural networks and a capsule structure are applied to the encoder and decoder of TibetanGoTinyNet to reduce the computational burden and achieve better feature extraction. Several autonomous self-attention mechanisms are integrated into TibetanGoTinyNet to capture the spatial and global information of the Tibetan Go board and to select important channels. The training data are generated entirely from self-play games. TibetanGoTinyNet achieves a 62%–78% winning rate against four other U-Net style models: Res-UNet, Res-UNet Attention, Ghost-UNet, and Ghost Capsule-UNet. It also achieves a 75% winning rate in the ablation experiments on the attention mechanism with embedded positional information. When migrated from 9×9 to 11×11 boards, the model saves about 33% of the training time while keeping a 45%–50% winning rate across different Monte Carlo tree search (MCTS) simulation counts. Code for our model is available at https://github.com/paulzyy/TibetanGoTinyNet.
Improving the intelligence of virtual entities is an important issue in the construction of Computer Generated Forces (CGFs). Some traditional approaches try to achieve this by specifying how entities should react to predefined conditions, which is not suitable for complex and dynamic environments. This paper applies Monte Carlo Tree Search (MCTS) to the behavior modeling of a CGF commander. Through look-ahead reasoning, the model generates adaptive decisions to direct all troops in combat. Our main work is to formulate the tree model through state and action abstraction, and to extend its expansion process to handle simultaneous and durative moves. We also employ Hierarchical Task Network (HTN) planning to guide the search, thus enhancing search efficiency. The final implementation is tested in an infantry combat simulation in which a company commander controls three platoons to assault and clear enemies within defined areas. Comparative results from a series of experiments demonstrate that the HTN-guided MCTS commander can outperform commanders that follow fixed strategies.
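The abstracts in this listing do not say which rule their MCTS variants use to pick a child during the selection step. As an illustration only, the sketch below shows UCB1, a common choice; the function names and the `(total_reward, visits)` layout of the statistics are assumptions made for this example.

```python
import math


def ucb1_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 value of a child: mean reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children, c=math.sqrt(2)):
    """Pick the child name with the highest UCB1 score.

    `children` maps a (hypothetical) action name to (total_reward, visits).
    """
    parent_visits = sum(visits for _, visits in children.values())
    return max(
        children,
        key=lambda name: ucb1_score(*children[name], parent_visits, c),
    )


stats = {"assault": (9.0, 10), "flank": (1.0, 2)}
explored = select_child(stats)        # bonus favours the rarely tried move
greedy = select_child(stats, c=0.0)   # no bonus: highest mean reward wins
```

Setting `c` to zero turns the rule into pure exploitation, which is why the exploration constant is usually tuned per problem.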
We consider three random variables X_n, Y_n and Z_n, which represent the numbers of nodes with 0, 1, and 2 children, respectively, in binary search trees of size n. The expectations and variances of these three random variables are obtained, and it is shown by applying the contraction method that X_n, Y_n and Z_n are all asymptotically normal as n→∞.
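For small n, the three counts can be checked by brute force. The sketch below assumes the usual random permutation model (each insertion order of the keys 1..n is equally likely), which the abstract does not state explicitly; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def degree_counts(keys):
    """Return (X, Y, Z): nodes with 0, 1, and 2 children in the BST
    obtained by inserting `keys` in the given order."""
    if not keys:
        return (0, 0, 0)
    root, rest = keys[0], keys[1:]
    left = [k for k in rest if k < root]    # keys routed to the left subtree
    right = [k for k in rest if k > root]   # keys routed to the right subtree
    lx, ly, lz = degree_counts(left)
    rx, ry, rz = degree_counts(right)
    own = [0, 0, 0]
    own[bool(left) + bool(right)] = 1       # classify the root itself
    return (lx + rx + own[0], ly + ry + own[1], lz + rz + own[2])


def exact_means(n):
    """Exact E[X_n], E[Y_n], E[Z_n] by enumerating all n! insertion orders."""
    totals = [0, 0, 0]
    count = 0
    for perm in permutations(range(1, n + 1)):
        for i, value in enumerate(degree_counts(list(perm))):
            totals[i] += value
        count += 1
    return tuple(Fraction(t, count) for t in totals)


means = exact_means(3)  # (Fraction(4, 3), Fraction(4, 3), Fraction(1, 3))
```

Two structural identities hold for every tree and make useful sanity checks: X_n + Y_n + Z_n = n, and X_n = Z_n + 1.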
The concurrent manipulation of an expanded AVL tree (EAVL tree) is considered in this paper. The presented system can support any number of concurrent processes that perform searching, insertion, and deletion on the tree. Simulation results indicate the high performance of the system. Elaborate techniques are used to achieve such a system, which is unavailable with any known algorithms. The methods developed in this paper may provide new insights into other problems in the area of concurrent search structure manipulation.
Using a triangular lattice model to study the designability of protein folding, we overcame the parity problem of the previous cubic lattice model and enumerated all the sequences and compact structures on a simple two-dimensional triangular lattice model of size 4+5+6+5+4. We used two types of amino acids, hydrophobic and polar, to make up the sequences, and obtained 2^(23)+2^(12) different sequences, excluding reverse-symmetric sequences. The total number of distinct compact structures was 219,093, excluding reflection symmetry, for self-avoiding paths of length 24 on the triangular lattice model. Based on this model, we applied a fast search algorithm built on a cluster tree. The algorithm reduces the computation by computing the objective energy of non-leaf nodes. Parallel experiments showed that the fast tree search algorithm yielded an exponential speed-up in the model of size 4+5+6+5+4. A designability analysis was performed to interpret the search results.
A 4×4 64-QAM multiple-input multiple-output (MIMO) detector is presented for IEEE 802.11n wireless local area network applications. The detector implements a novel adaptive tree search (ATS) algorithm, and multiple ATS cores need to be instantiated to meet the wideband requirement of the 802.11n standard. Both the ATS algorithm and the architectural considerations are explained. The latency of the detector is 0.75 μs, and the detector has a gate count of 848 k with a total of 19 parallel ATS cores. Each ATS core runs at 67 MHz. Measurement results show that, compared with the floating-point ATS algorithm, the fixed-point implementation incurs a loss of 0.9 dB at a BER of 10^-3.
Solving the optimization problem of approaching a Nash Equilibrium plays an important role in imperfect-information games such as StarCraft and poker. Neural Fictitious Self-Play (NFSP) is an effective algorithm that learns an approximate Nash Equilibrium of imperfect-information games purely from self-play, without prior domain knowledge. However, it needs to train a neural network in an off-policy manner to approximate the action values. For games with large search spaces, the training may suffer from unnecessary exploration and sometimes fails to converge. In this paper, we propose a new Neural Fictitious Self-Play algorithm, called MC-NFSP, that combines Monte Carlo tree search with NFSP to improve performance in real-time zero-sum imperfect-information games. Through experiments and empirical analysis, we demonstrate that the proposed MC-NFSP algorithm can approximate a Nash Equilibrium in games with large search depth, whereas NFSP cannot. Furthermore, we develop an Asynchronous Neural Fictitious Self-Play framework (ANFSP). It uses an asynchronous, parallel architecture to collect game experience and improves both training efficiency and policy quality. Experiments on games with hidden state information (Texas Hold'em) and on first-person shooter (FPS) games demonstrate the effectiveness of our algorithms.
Richly formatted documents, such as financial disclosures, scientific articles, and government regulations, are widespread on the Web. However, since most of these documents are intended only for public reading, their styling information is usually missing, which makes them difficult or even burdensome to display and edit in different formats and on different platforms. In this study, we formulate document styling restoration as an optimization problem: identify the styling settings of the document elements (e.g., lines, table cells, text) so that rendering with the output settings produces a document in which each element occupies the same, or nearly the same, position as in the original document. Since each styling setting is a decision, this problem can be transformed into a multi-step decision-making task over all the document elements and then solved by reinforcement learning. Specifically, Monte-Carlo Tree Search (MCTS) is used to explore the different styling settings, and the policy function is learned under the supervision of delayed rewards. As a case study, we restore the styling information of tables, where structural and functional data in documents are usually presented. Experiments show that our best reinforcement method successfully restores the styling of 87.65% of the tables, an absolute improvement of 25.75% over the greedy method. We also discuss the tradeoff between inference time and restoration success rate, and argue that although the reinforcement methods cannot be used in real-time scenarios, they are suitable for offline tasks with high quality requirements. Finally, this model has been applied in a PDF parser to support cross-format display.
Introducing InterSatellite Links (ISLs) is a major trend in new-generation Global Navigation Satellite Systems (GNSSs). Data transmission scheduling is a crucial problem in the study of ISL management. Existing research on intersatellite data transmission has not considered the capacity of ISL bandwidth. The current study is thus the first to describe the intersatellite data transmission scheduling problem with capacity restrictions in GNSSs. A model conversion strategy is designed to model this problem as a length-bounded single-path multicommodity flow problem. An integer programming model is constructed to minimize the maximal sum of flows on each intersatellite edge, which is equivalent to minimizing the maximal occupied ISL bandwidth. An iterated tree search algorithm is proposed to solve the problem, and two ranking rules are designed to guide the search. Experiments based on the BeiDou satellite constellation are designed, and the results demonstrate the effectiveness of the proposed model and algorithm.
The three-dimensional packing problem generally concerns how to pack a set of models into a given bounding box using the smallest packaging volume. It is known to be NP-hard. In the mechanical field, the space utilization of a mechanism is low because of the constraints that mechanical joints impose between different mechanical parts. Although this can be improved by breaking the mechanism into components at every joint, doing so burdens the user when reassembling the mechanism and may also reduce the service life of the mechanical parts. In this paper, we propose a novel mechanism packing algorithm that explicitly considers the degrees of freedom (DOFs) of mechanical joints. With this algorithm, we construct the solution space according to each joint. While building the search tree of the splitting scheme, we do not break a joint but move it instead. The proposed algorithm therefore requires only the minimal number of splits to meet the space utilization goal. Numerical examples show that the proposed method packs three-dimensional models into a given bounding box conveniently and efficiently, with high space utilization.
Purpose: The purpose of this paper is to establish a version of a theorem that originated in population genetics and was later adopted in evolutionary computation theory, and that will lead to novel Monte-Carlo sampling algorithms that provably increase the AI potential. Design/methodology/approach: The authors set up a mathematical framework, then state and prove a version of a Geiringer-like theorem that is well suited to the development of Monte-Carlo sampling algorithms for making decisions under randomness and incomplete information. Findings: This work establishes an important theoretical link between classical population genetics, evolutionary computation theory, and model-free reinforcement learning methodology. Not only may the theory explain the success of existing Monte-Carlo tree sampling methodology, but it also leads to the development of novel Monte-Carlo sampling techniques guided by a rigorous mathematical foundation. Practical implications: The theoretical foundations established in this work provide guidance for the design of powerful Monte-Carlo sampling algorithms in model-free reinforcement learning, to tackle numerous problems in computational intelligence. Originality/value: Establishing a Geiringer-like theorem with non-homologous recombination was a long-standing open problem in evolutionary computation theory. Apart from overcoming this challenge in a mathematically elegant fashion and establishing a rather general and powerful version of the theorem, this work leads directly to the development of novel, provably powerful algorithms for decision making in environments involving randomness and hidden or incomplete information.
Previous research has combined model-free reinforcement learning with model-based tree search methods to solve the unit commitment problem with stochastic demand and renewable generation. This approach was limited to shallow search depths and suffered from significant variability in run time across problem instances of varying complexity. To mitigate these issues, we extend this methodology to more advanced search algorithms based on A* search. First, we develop a problem-specific heuristic based on priority list unit commitment methods and apply it in Guided A* search, reducing run time by up to 94% with negligible impact on operating costs. In addition, we address the run time variability issue by employing a novel anytime algorithm, Guided IDA*, which replaces the fixed search depth parameter with a time budget constraint. We show that Guided IDA* mitigates the run time variability of previous guided tree search algorithms and enables further operating cost reductions of up to 1%.
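For readers unfamiliar with the search family that Guided A* and Guided IDA* build on, the following is a minimal sketch of textbook A* on a small weighted graph. It is not the paper's guided variant: the graph, the heuristic table, and the function name `a_star` are hypothetical, and the heuristic is assumed to be admissible (it never overestimates the remaining cost).

```python
import heapq


def a_star(graph, heuristic, start, goal):
    """Textbook A* search.

    `graph` maps a node to a list of (neighbor, step_cost) pairs and
    `heuristic(node)` estimates the remaining cost to `goal`.
    Returns (total_cost, path), or None if the goal is unreachable.
    """
    # Frontier entries are (f = g + h, g, node, path so far).
    frontier = [(heuristic(start), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node exists
        for neighbor, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                priority = new_cost + heuristic(neighbor)
                heapq.heappush(
                    frontier, (priority, new_cost, neighbor, path + [neighbor])
                )
    return None


graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 1), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
estimate = {"A": 3, "B": 2, "C": 1, "D": 0}
result = a_star(graph, estimate.get, "A", "D")  # (3, ['A', 'B', 'C', 'D'])
```

A more informative heuristic lets the search skip more of the tree, which is the general mechanism behind run time reductions in heuristic-guided search.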
Funding for the HTN/MCTS planning, monitoring, and replanning framework: National Natural Science Foundation of China (61806221).
Funding for the adaptive hybrid search tree (AHST) tag anti-collision scheme: National Natural Science Foundation of China (No. 61401407).
Funding for the SIMD tree search study: Shanghai Leading Academic Discipline Project (Grant No. J50103) and the Graduate Student Innovation Foundation of Shanghai University (Grant No. SHUCX112167).
Funding for the physical layer network coding based binary search tree algorithm (PNBA): National Natural Science Foundation of China (Grant 61502411); Natural Science Foundation of Jiangsu Province (Grants BK20150432 and BK20151299); Natural Science Research Project for Universities of Jiangsu Province (Grant 15KJB520034); China Postdoctoral Science Foundation (Grant 2015M581843); Jiangsu Provincial Qinglan Project; Teachers Overseas Study Program of Yancheng Institute of Technology; Jiangsu Provincial Government Scholarship for Overseas Studies; Talents Project of Yancheng Institute of Technology (Grant KJC2014038); "2311" Talent Project of Yancheng Institute of Technology; Open Fund of Modern Agricultural Resources Intelligent Management and Application Laboratory of Huzhou Normal University.
Funding for the intelligent Task Offloading Algorithm (iTOA) for UAV edge computing: Artificial Intelligence Key Laboratory of Sichuan Province (No. 2019RYJ05) and National Natural Science Foundation of China (No. 61971107).
Funding for the geographic information service composition (GISC) study: National Natural Science Foundation of China (Nos. 41971356, 41671400, 41701446); National Key Research and Development Program of China (Nos. 2017YFB0503600, 2018YFB0505500); Hubei Province Natural Science Foundation of China (No. 2017CFB277).
Funding: Supported by the National Natural Science Foundation of China (Nos. 62276285 and 62236011) and the Major Projects of Social Science Foundation of China (No. 20&ZD279).
Abstract: The game of Tibetan Go suffers from a scarcity of expert knowledge and research literature. Therefore, we study the zero learning model of Tibetan Go under limited computing power and propose TibetanGoTinyNet, a novel scale-invariant, U-Net style, two-headed lightweight network. Lightweight convolutional neural networks and a capsule structure are applied to the encoder and decoder of TibetanGoTinyNet to reduce the computational burden and achieve better feature extraction. Several autonomous self-attention mechanisms are integrated into TibetanGoTinyNet to capture the spatial and global information of the Tibetan Go board and to select important channels. The training data are generated entirely from self-play games. TibetanGoTinyNet achieves a 62%–78% winning rate against four other U-Net style models: Res-UNet, Res-UNet Attention, Ghost-UNet, and Ghost Capsule-UNet. It also achieves a 75% winning rate in the ablation experiments on the attention mechanism with embedded positional information. When migrated from 9×9 to 11×11 boards, the model saves about 33% of the training time while keeping a 45%–50% winning rate across different Monte Carlo tree search (MCTS) simulation counts. Code for our model is available at https://github.com/paulzyy/TibetanGoTinyNet.
Abstract: Improving the intelligence of virtual entities is an important issue in the construction of Computer Generated Forces (CGFs). Some traditional approaches try to achieve this by specifying how entities should react to predefined conditions, which is not suitable for complex and dynamic environments. This paper applies Monte Carlo Tree Search (MCTS) to the behavior modeling of a CGF commander. Through look-ahead reasoning, the model generates adaptive decisions to direct all troops in combat. Our main work is to formulate the tree model through state and action abstraction, and to extend its expansion process to handle simultaneous and durative moves. We also employ Hierarchical Task Network (HTN) planning to guide the search, thus enhancing search efficiency. The final implementation is tested in an infantry combat simulation in which a company commander must control three platoons to assault and clear enemies within defined areas. Comparative results from a series of experiments demonstrate that the HTN-guided MCTS commander can outperform other commanders that follow fixed strategies.
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 10671188), the Knowledge Innovation Program of the Chinese Academy of Sciences (Grant No. KJCX3-SYW-S02), and the Special Foundation of University of Science and Technology of China.
Abstract: We consider three random variables X_n, Y_n and Z_n, which represent the numbers of nodes with 0, 1, and 2 children, respectively, in binary search trees of size n. The expectations and variances of these three random variables are obtained, and it is also shown, by applying the contraction method, that X_n, Y_n and Z_n are all asymptotically normal as n→∞.
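The quantities X_n, Y_n and Z_n above can be observed empirically. The sketch below assumes the usual random permutation model, in which the tree is built by inserting a uniformly random permutation of n distinct keys. The abstract does not state the model explicitly, and the function names are hypothetical.

```python
import random
from collections import Counter


def build_random_bst(n, rng):
    """Insert a uniformly random permutation of 0..n-1 into an empty BST.

    Returns (keys, left, right), where left/right map a key to its child key.
    """
    keys = list(range(n))
    rng.shuffle(keys)
    left, right = {}, {}
    root = None
    for k in keys:
        if root is None:
            root = k
            continue
        cur = root
        while True:
            side = left if k < cur else right
            if cur in side:
                cur = side[cur]      # descend one level
            else:
                side[cur] = k        # attach the new key as a child
                break
    return keys, left, right


def child_counts(n, rng):
    """Return (X_n, Y_n, Z_n): numbers of nodes with 0, 1, and 2 children."""
    keys, left, right = build_random_bst(n, rng)
    counts = Counter((k in left) + (k in right) for k in keys)
    return counts[0], counts[1], counts[2]


rng = random.Random(0)
x, y, z = child_counts(200, rng)
```

Two identities hold for every non-empty binary tree and make a quick sanity check: X_n + Y_n + Z_n = n and Z_n = X_n - 1. Because of them, the three variables are affinely related, so results for one carry over to the others.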
Abstract: The concurrent manipulation of an expanded AVL tree (EAVL tree) is considered in this paper. The presented system can support any number of concurrent processes that perform searching, insertion and deletion on the tree. Simulation results indicate the high performance of the system. Elaborate techniques are used to achieve such a system, which is unavailable based on any known algorithms. The methods developed in this paper may provide new insights into other problems in the area of concurrent search structure manipulation.
Abstract: Using a triangular lattice model to study the designability of protein folding, we overcame the parity problem of the previous cubic lattice model and enumerated all the sequences and compact structures on a simple two-dimensional triangular lattice model of size 4+5+6+5+4. We used two types of amino acids, hydrophobic and polar, to make up the sequences, and obtained 2^23+2^12 different sequences, excluding reverse-symmetric sequences. The total number of distinct compact structures was 219,093, excluding reflection symmetry in the self-avoiding path of length 24 on the triangular lattice model. Based on this model, we applied a fast search algorithm that constructs a cluster tree. The algorithm reduced the computation by computing the objective energy of non-leaf nodes. The parallel experiments showed that the fast tree search algorithm yielded an exponential speed-up in the model of size 4+5+6+5+4. Designability analysis was performed to understand the search result.
Abstract: A 4×4 64-QAM multiple-input multiple-output (MIMO) detector is presented for use in an IEEE 802.11n wireless local area network. The detector implements a novel adaptive tree search (ATS) algorithm, and multiple ATS cores need to be instantiated to meet the wideband requirement of the 802.11n standard. Both the ATS algorithm and the architectural considerations are explained. The latency of the detector is 0.75 μs, and the detector has a gate count of 848 k with a total of 19 parallel ATS cores. Each ATS core runs at 67 MHz. Measurement results show that, compared with the floating-point ATS algorithm, the fixed-point implementation incurs a loss of 0.9 dB at a BER of 10^-3.
Funding: National Key Research and Development Program of China (2017YFB1002503); Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2018AAA0100902), China.
Abstract: Solving the optimization problem of approaching a Nash Equilibrium point plays an important role in imperfect-information games, e.g., StarCraft and poker. Neural Fictitious Self-Play (NFSP) is an effective algorithm that learns an approximate Nash Equilibrium of imperfect-information games purely from self-play, without prior domain knowledge. However, it needs to train a neural network in an off-policy manner to approximate the action values. For games with large search spaces, the training may suffer from unnecessary exploration and sometimes fails to converge. In this paper, we propose a new Neural Fictitious Self-Play algorithm, called MC-NFSP, that combines Monte Carlo tree search with NFSP to improve performance in real-time zero-sum imperfect-information games. With experiments and empirical analysis, we demonstrate that the proposed MC-NFSP algorithm can approximate a Nash Equilibrium in games with large-scale search depth, whereas NFSP cannot. Furthermore, we develop an Asynchronous Neural Fictitious Self-Play framework (ANFSP). It uses an asynchronous and parallel architecture to collect game experience and improves both training efficiency and policy quality. Experiments with games with hidden state information (Texas Hold'em) and FPS (first-person shooter) games demonstrate the effectiveness of our algorithms.
Funding: This work was supported by the National Key Research and Development Program of China (2017YFB1002104), the National Natural Science Foundation of China (Grant No. U1811461), and the Innovation Program of Institute of Computing Technology, CAS.
Abstract: Richly formatted documents, such as financial disclosures, scientific articles, and government regulations, are widespread on the Web. However, since most of these documents are intended only for public reading, the styling information inside them is usually missing, which makes them inconvenient or even burdensome to display and edit in different formats and on different platforms. In this study, we formulate the task of document styling restoration as an optimization problem that aims to identify the styling settings of the document elements (e.g., lines, table cells, text) so that rendering with the output styling settings yields a document in which each element holds (nearly) the same position as in the original document. Considering that each styling setting is a decision, this problem can be transformed into a multi-step decision-making task over all the document elements and then solved by reinforcement learning. Specifically, Monte-Carlo Tree Search (MCTS) is leveraged to explore the different styling settings, and the policy function is learned under the supervision of delayed rewards. As a case study, we restore the styling information inside tables, where structural and functional data in documents are usually presented. Experiments show that our best reinforcement method successfully restores the styling in 87.65% of the tables, a 25.75% absolute improvement over the greedy method. We also discuss the tradeoff between inference time and restoration success rate, and argue that although the reinforcement methods cannot be used in real-time scenarios, they are suitable for offline tasks with high quality requirements. Finally, this model has been applied in a PDF parser to support cross-format display.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61773120 and 71901213) and the Foundation for the Author of National Excellent Doctoral Dissertation of China (No. 2014-92).
Abstract: Introducing Inter-Satellite Links (ISLs) is a major trend in new-generation Global Navigation Satellite Systems (GNSSs). Data transmission scheduling is a crucial problem in the study of ISL management. Existing research on intersatellite data transmission has not considered the capacity of ISL bandwidth. Thus, the current study is the first to describe the intersatellite data transmission scheduling problem with capacity restrictions in GNSSs. A model conversion strategy is designed to model this problem as a length-bounded single-path multicommodity flow problem. An integer programming model is constructed to minimize the maximal sum of flows on each intersatellite edge; this minimization is equivalent to minimizing the maximal occupied ISL bandwidth. An iterated tree search algorithm is proposed to solve the problem, and two ranking rules are designed to guide the search. Experiments based on the BeiDou satellite constellation are designed, and the results demonstrate the effectiveness of the proposed model and algorithm.
Funding: The work was supported by the National Key Research and Development Program of China under Grant No. 2017YFC0804401, the National Natural Science Foundation of China under Grant Nos. 61472370, 61672469, 61379079, 61322204, and 61502433, the Natural Science Foundation of Henan Province of China under Grant No. 162300410262, and the Key Research Projects of Henan Higher Education Institutions of China under Grant No. 18A413002.
Abstract: The three-dimensional packing problem generally concerns how to pack a set of models into a given bounding box using the smallest packaging volume. It is known to be NP-hard. When the packing problem is considered in the mechanical field, the space utilization of a mechanism is low because of the constraints imposed by mechanical joints between different mechanical parts. Although this situation can be improved by breaking the mechanism into components at every joint, doing so burdens the user when reassembling the mechanism and may also reduce the service life of the mechanical parts. In this paper, we propose a novel mechanism packing algorithm that deliberately considers the DOFs (degrees of freedom) of mechanical joints. With this algorithm, we construct the solution space according to each joint. While building the search tree of the splitting scheme, we do not break the joint but move it. Therefore, the proposed algorithm requires only the minimal number of splits to meet the space utilization goal. Numerical examples show that the proposed method is convenient and efficient for packing three-dimensional models into a given bounding box with high space utilization.
Funding: This work has been sponsored by EPSRC EP/D003/05/1 "Amorphous Computing" and EPSRC EP/I009809/1 "Evolutionary Approximation Algorithms for Optimization: Algorithm Design and Complexity Analysis" Grants.
Abstract: Purpose: The purpose of this paper is to establish a version of a theorem that originated in population genetics and was later adopted in evolutionary computation theory, and that will lead to novel Monte-Carlo sampling algorithms that provably increase the AI potential. Design/methodology/approach: In the current paper the authors set up a mathematical framework and state and prove a version of a Geiringer-like theorem that is very well suited to the development of Monte-Carlo sampling algorithms that cope with randomness and incomplete information when making decisions. Findings: This work establishes an important theoretical link between classical population genetics, evolutionary computation theory and model-free reinforcement learning methodology. Not only may the theory explain the success of the currently existing Monte-Carlo tree sampling methodology, but it also leads to the development of novel Monte-Carlo sampling techniques guided by a rigorous mathematical foundation. Practical implications: The theoretical foundations established in the current work provide guidance for the design of powerful Monte-Carlo sampling algorithms in model-free reinforcement learning, to tackle numerous problems in computational intelligence. Originality/value: Establishing a Geiringer-like theorem with non-homologous recombination was a long-standing open problem in evolutionary computation theory. Apart from overcoming this challenge in a mathematically elegant fashion and establishing a rather general and powerful version of the theorem, this work leads directly to the development of novel, provably powerful algorithms for decision making in environments involving randomness and hidden or incomplete information.
Funding: Supported by an Engineering and Physical Sciences Research Council research studentship (grant number: EP/R512400/1).
Abstract: Previous research has combined model-free reinforcement learning with model-based tree search methods to solve the unit commitment problem with stochastic demand and renewables generation. This approach was limited to shallow search depths and suffered from significant variability in run time across problem instances of varying complexity. To mitigate these issues, we extend this methodology to more advanced search algorithms based on A* search. First, we develop a problem-specific heuristic based on priority list unit commitment methods and apply it in Guided A* search, reducing run time by up to 94% with negligible impact on operating costs. In addition, we address the run time variability issue by employing a novel anytime algorithm, Guided IDA*, which replaces the fixed search depth parameter with a time budget constraint. We show that Guided IDA* mitigates the run time variability of previous guided tree search algorithms and enables further operating cost reductions of up to 1%.