The pursuit-evasion game models the strategic interaction among players, attracting attention in many realistic scenarios, such as missile guidance, unmanned aerial vehicles, and target defense. Existing studies mainly concentrate on the cooperative pursuit of multiple players in two-dimensional pursuit-evasion games. However, these approaches can hardly be applied to practical situations where players usually move in three-dimensional space with a three-degree-of-freedom control. In this paper, we make the first attempt to investigate the equilibrium strategy of the realistic pursuit-evasion game, in which the pursuer follows a three-degree-of-freedom control and the evader moves freely. First, we describe the pursuer's three-degree-of-freedom control and the evader's relative coordinate. We then rigorously derive the equilibrium strategy by solving the retrogressive path equation according to the Hamilton-Jacobi-Bellman-Isaacs (HJBI) method, which divides the pursuit-evasion process into the navigation and acceleration phases. Besides, we analyze the maximum allowable speed for the pursuer to capture the evader successfully and provide the strategy with which the evader can escape when the pursuer's speed exceeds the threshold. We further conduct comparison tests with various unilateral deviations to verify that the proposed strategy forms a Nash equilibrium.
The UAV pursuit-evasion problem focuses on the efficient tracking and capture of evading targets using unmanned aerial vehicles (UAVs), which is pivotal in public safety applications, particularly in scenarios involving intrusion monitoring and interception. To address the challenges of data acquisition, real-world deployment, and the limited intelligence of existing algorithms in UAV pursuit-evasion tasks, we propose an innovative swarm intelligence-based UAV pursuit-evasion control framework, namely "Boids Model-based DRL Approach for Pursuit and Escape" (Boids-PE), which synergizes the strengths of swarm intelligence from bio-inspired algorithms and deep reinforcement learning (DRL). The Boids model, which simulates collective behavior through three fundamental rules, separation, alignment, and cohesion, is adopted in our work. By integrating the Boids model with the Apollonian Circles algorithm, significant improvements are achieved in capturing UAVs against simple evasion strategies. To further enhance decision-making precision, we incorporate a DRL algorithm to facilitate more accurate strategic planning. We also leverage self-play training to continuously optimize the performance of pursuit UAVs. During experimental evaluation, we meticulously designed both one-on-one and multi-to-one pursuit-evasion scenarios, customizing the state space, action space, and reward function models for each scenario. Extensive simulations, supported by the PyBullet physics engine, validate the effectiveness of our proposed method. The overall results demonstrate that Boids-PE significantly enhances the efficiency and reliability of UAV pursuit-evasion tasks, providing a practical and robust solution for real-world UAV pursuit-evasion missions.
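The three Boids rules named in the abstract can be sketched in a few lines; the radii and weights below are assumptions for illustration, not the values used by Boids-PE:

```python
import numpy as np

def boids_step(pos, vel, r_sep=1.0, r_nbr=5.0,
               w_sep=1.5, w_ali=0.5, w_coh=0.3, dt=0.1, v_max=2.0):
    """One update of the three Boids rules: separation, alignment, cohesion.

    `pos` and `vel` are (N, 2) arrays; radii and weights are illustrative.
    """
    n = len(pos)
    acc = np.zeros_like(pos)
    for i in range(n):
        offsets = pos - pos[i]
        dists = np.linalg.norm(offsets, axis=1)
        nbrs = (dists > 0) & (dists < r_nbr)
        if not nbrs.any():
            continue
        # Separation: steer away from agents that are too close.
        close = nbrs & (dists < r_sep)
        if close.any():
            acc[i] -= w_sep * offsets[close].sum(axis=0)
        # Alignment: match the neighbours' mean velocity.
        acc[i] += w_ali * (vel[nbrs].mean(axis=0) - vel[i])
        # Cohesion: steer toward the neighbours' centre of mass.
        acc[i] += w_coh * (pos[nbrs].mean(axis=0) - pos[i])
    vel = vel + acc * dt
    speed = np.linalg.norm(vel, axis=1, keepdims=True)
    scale = np.minimum(1.0, v_max / np.maximum(speed, 1e-9))  # clamp speed
    vel = vel * scale
    return pos + vel * dt, vel

rng = np.random.default_rng(0)
pos = rng.uniform(0, 4, size=(8, 2))
vel = rng.uniform(-1, 1, size=(8, 2))
for _ in range(50):
    pos, vel = boids_step(pos, vel)
```

In a pursuit framework such as Boids-PE, a term steering toward the predicted intercept point would be added to these three accelerations.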
Current successes in the artificial intelligence domain have revitalized interest in the spacecraft pursuit-evasion game, which is an interception problem with a non-cooperative maneuvering target. This paper presents an automated machine learning (AutoML) based method to generate optimal trajectories in long-distance scenarios. Compared with conventional deep neural network (DNN) methods, the proposed method dramatically reduces the reliance on manual intervention and machine learning expertise. Firstly, based on differential game theory and the costate normalization technique, the trajectory optimization problem is formulated under the assumption of continuous thrust. Secondly, the AutoML technique based on the sequential model-based optimization (SMBO) framework is introduced to automate DNN design in the deep learning process. If a recommended DNN architecture exists, the tree-structured Parzen estimator (TPE) is used; otherwise, efficient neural architecture search (NAS) with network morphism is used. Thus, a novel trajectory optimization method with high computational efficiency is achieved. Finally, numerical results demonstrate the feasibility and efficiency of the proposed method.
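For readers unfamiliar with the TPE step of SMBO, a bare-bones one-dimensional version can be sketched as follows; real TPE uses Parzen mixtures over structured search spaces, so this is only a toy illustration, not the paper's pipeline:

```python
import numpy as np

def tpe_minimize(f, bounds, n_init=10, n_iter=40, gamma=0.25, seed=0):
    """A bare-bones Tree-structured Parzen Estimator for one scalar
    hyperparameter: model the 'good' (below the gamma-quantile) and 'bad'
    observations with single Gaussians l(x) and g(x), and evaluate the
    candidate maximising the ratio l(x)/g(x)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    xs = list(rng.uniform(lo, hi, n_init))
    ys = [f(x) for x in xs]
    for _ in range(n_iter):
        cut = np.quantile(ys, gamma)
        good = np.array([x for x, y in zip(xs, ys) if y <= cut])
        bad = np.array([x for x, y in zip(xs, ys) if y > cut])
        mu_l, sd_l = good.mean(), good.std() + 1e-3
        mu_g, sd_g = bad.mean(), bad.std() + 1e-3
        cand = rng.normal(mu_l, sd_l, 64).clip(lo, hi)   # sample from l(x)
        score = (np.exp(-0.5 * ((cand - mu_l) / sd_l) ** 2) / sd_l
                 / (np.exp(-0.5 * ((cand - mu_g) / sd_g) ** 2) / sd_g + 1e-12))
        x_new = float(cand[score.argmax()])
        xs.append(x_new)
        ys.append(f(x_new))
    return xs[int(np.argmin(ys))]

# Toy objective standing in for a validation loss over one hyperparameter.
best = tpe_minimize(lambda x: (x - 2.0) ** 2, (-5.0, 5.0))
```

In the paper's setting, `f` would be the validation loss of a candidate DNN architecture rather than an analytic function.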
With the development of space rendezvous and proximity operations (RPO) in recent years, scenarios with noncooperative spacecraft are attracting the attention of more and more researchers. A method based on the costate normalization technique and deep neural networks is presented to generate the optimal guidance law for the free-time orbital pursuit-evasion game. Firstly, the 24-dimensional problem given by differential game theory is transformed into a three-parameter optimization problem through a dimension-reduction method which guarantees the uniqueness of the solution for the specific scenario. Secondly, a closed-loop interactive mechanism involving feedback is introduced to the deep neural networks for generating a precise initial solution. Thus the optimal guidance law is obtained efficiently and stably by applying an optimization algorithm initialized by the deep neural networks. Finally, comparisons with two other methods and Monte Carlo simulation demonstrate the efficiency and robustness of the proposed optimal guidance method.
Miss distance is a critical parameter for assessing the performance of highly maneuvering target interception (HMTI). In a realistic terminal guidance system, the pursuer's control depends on an estimate of the unknown state, so the miss distance becomes a random variable with an a priori unknown distribution. Currently, such a distribution is mainly evaluated by Monte Carlo simulation. In this paper, by integrating the estimation error model of the zero-effort miss distance (ZEM) obtained in our previous work, an analytic method for solving the distribution of the miss distance is proposed, in which the system is presumed to use a bang-bang control strategy. By comparison with the results of Monte Carlo simulations under four different types of disturbances (maneuvers), the correctness of the proposed method is validated. The results of this paper provide a powerful tool for the design, analysis, and performance evaluation of guidance systems.
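The Monte Carlo baseline that the abstract's analytic method replaces can be illustrated with a toy one-dimensional engagement; the dynamics, noise model, and all constants below are assumptions, not the paper's:

```python
import numpy as np

def miss_distance_samples(n_runs=2000, seed=1):
    """Monte Carlo estimate of the miss-distance distribution for a toy
    1-D terminal engagement with bang-bang control and a noisy ZEM
    estimate (the simulation baseline, not the paper's dynamics)."""
    rng = np.random.default_rng(seed)
    dt, t_f, a_max, sigma = 0.02, 4.0, 10.0, 2.0
    misses = np.empty(n_runs)
    for k in range(n_runs):
        y, v = 50.0, 0.0                  # relative position and velocity
        t = 0.0
        while t < t_f:
            t_go = t_f - t
            zem = y + v * t_go + rng.normal(0.0, sigma)  # noisy zero-effort miss
            v += -a_max * np.sign(zem) * dt              # bang-bang pursuer control
            y += v * dt
            t += dt
        misses[k] = abs(y)                # terminal miss for this run
    return misses

misses = miss_distance_samples()
```

An analytic method such as the paper's replaces the thousands of runs behind `misses` with a closed-form distribution derived from the ZEM error model.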
This work is inspired by a stealth pursuit behavior called motion camouflage, whereby a pursuer approaches an evader while camouflaging itself against a predetermined background. We formulate the spacecraft pursuit-evasion problem as a stealth pursuit strategy of motion camouflage, in which the pursuer tries to minimize a motion camouflage index defined in this paper. The Euler-Hill reference frame, whose origin is set on the circular reference orbit, is used to describe the dynamics. Based on the rule of motion camouflage, an open-loop guidance strategy that achieves motion camouflage is derived, in which the pursuer lies on the camouflage constraint line connecting the central spacecraft and the evader. To remove the dependence on the evader's acceleration in the open-loop guidance strategy, we further consider the motion camouflage pursuit problem within an infinite-horizon nonlinear quadratic differential game. The saddle point solution to the game is derived by using the state-dependent Riccati equation method, and the resulting closed-loop guidance strategy is effective in achieving motion camouflage. Simulations are performed to demonstrate the capabilities of the proposed guidance strategies in the pursuit-evasion game scenario.
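The camouflage constraint line mentioned in the abstract admits a simple geometric check; the deviation measure below is an illustration, not the motion camouflage index the paper defines:

```python
import numpy as np

def camouflage_deviation(p, e, r=np.zeros(3)):
    """Distance of the pursuer `p` from the line joining the reference
    point `r` (the central spacecraft) and the evader `e`.  Zero means
    the camouflage constraint is met; from the background point `r`, the
    pursuer then stays visually superimposed on the evader."""
    d = e - r
    u = d / np.linalg.norm(d)                  # unit vector along the constraint line
    w = p - r
    return np.linalg.norm(w - (w @ u) * u)     # component of w orthogonal to the line

on_line = camouflage_deviation(np.array([2.0, 2.0, 2.0]), np.array([4.0, 4.0, 4.0]))
off_line = camouflage_deviation(np.array([2.0, 0.0, 0.0]), np.array([4.0, 4.0, 4.0]))
```

A guidance law enforcing motion camouflage drives this deviation (or the paper's index) toward zero while closing the range to the evader.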
The qualitative spacecraft pursuit-evasion problem, which focuses on feasibility, is rarely studied because of high-dimensional dynamics, intractable terminal constraints, and heavy computational cost. In this paper, a physics-informed framework is proposed for the problem, providing an intuitive method for spacecraft threat relationship determination, situation assessment, mission feasibility analysis, and orbital game rule summarization. For the first time, situation adjustment suggestions can be provided for the weak player in an orbital game. First, dimension-reduced dynamics are derived in the line-of-sight rotation coordinate system and the qualitative model is determined, reducing complexity and avoiding the difficulty of target set presentation caused by individual modeling. Second, the Backwards Reachable Set (BRS) of the target set is used for state-space partition and capture zone presentation. Reverse-time analysis eliminates the influence of the changeable initial state and enables the proposed framework to analyze multiple situations simultaneously. Third, a time-dependent Hamilton-Jacobi-Isaacs (HJI) Partial Differential Equation (PDE) is established to describe the BRS evolution driven by the dimension-reduced dynamics, based on the level set method. Then, Physics-Informed Neural Networks (PINNs) are extended to the HJI PDE final value problem, supporting orbital game rule summarization through capture zone evolution analysis. Finally, numerical results demonstrate the feasibility and efficiency of the proposed framework.
In this paper, the pursuit-evasion game with state and control constraints is solved to achieve the Nash equilibrium of both the pursuer and the evader with an iterative self-play technique. Under the condition that the Hamiltonian formed by means of Pontryagin's maximum principle has a unique solution, it can be proven that the iterative control law converges to the Nash equilibrium solution. However, the strong nonlinearity of the ordinary differential equations formulated by Pontryagin's maximum principle makes the control policy difficult to figure out. Moreover, the system dynamics employed in this paper contain a high-dimensional state vector with constraints. In practical applications, such as the control of aircraft, the available overload is limited. Therefore, in this paper, we consider the optimal strategy of pursuit-evasion games with a constant constraint on the control, while some state vectors are restricted by a function of the input. To address these challenges, the optimal control problems are transformed into nonlinear programming problems through the direct collocation method. Finally, two numerical cases of the aircraft pursuit-evasion scenario are given to demonstrate the effectiveness of the presented method in obtaining the optimal control of both the pursuer and the evader.
For complex functions to emerge in artificial systems, it is important to understand the intrinsic mechanisms of biological swarm behaviors in nature. In this paper, we present a comprehensive survey of pursuit-evasion, which is a critical problem in biological groups. First, we review the problem of pursuit-evasion from three different perspectives: game theory, control theory and artificial intelligence, and bio-inspired perspectives. Then we provide an overview of the research on pursuit-evasion problems in biological systems and artificial systems. We summarize predator pursuit behavior and prey evasion behavior as predator-prey behavior. Next, we analyze the application of pursuit-evasion in artificial systems from three perspectives, i.e., strong pursuer group vs. weak evader group, weak pursuer group vs. strong evader group, and equal-ability groups. Finally, relevant prospects for future pursuit-evasion challenges are discussed. This survey provides new insights into the design of multi-agent and multi-robot systems for completing complex hunting tasks in uncertain dynamic scenarios.
The orbital pursuit-evasion game is typically formulated as a complete-information game, which assumes the payoff functions of the two players are common knowledge. However, realistic pursuit-evasion games typically have incomplete information, in which the lack of payoff information limits a player's ability to play optimally. To address this problem, this paper proposes a currently optimal escape strategy based on estimation for the evader. In this strategy, the currently optimal evasive controls are first derived based on the evader's guess of the pursuer's payoff weightings. Then an online parameter estimation method based on a modified strong tracking unscented Kalman filter is employed to modify the guess and update the strategy during the game. As the estimation becomes accurate, the currently optimal strategy approaches the actually optimal strategy. Simulation results show that the proposed strategy achieves optimal evasive controls progressively, and the evader's payoff under this strategy is lower than that of the zero-sum escape strategy. Meanwhile, the proposed strategy is also effective in the case where the pursuer changes its payoff function halfway through the game.
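The online estimation step can be illustrated with a much-reduced example; a scalar Kalman filter stands in for the paper's modified strong tracking unscented Kalman filter, and the linear measurement model is an assumption:

```python
import numpy as np

def estimate_weight(meas, h=2.0, r_var=0.04, q_var=1e-4, x0=0.0, p0=1.0):
    """Scalar Kalman filter recovering a constant payoff weighting w from
    noisy observations z_k = h * w + v_k of the pursuer's control.  A toy
    stand-in for the paper's modified strong tracking UKF."""
    x, p = x0, p0
    for z in meas:
        p = p + q_var                    # predict (random-walk model for w)
        k = p * h / (h * h * p + r_var)  # Kalman gain
        x = x + k * (z - h * x)          # update with the innovation
        p = (1.0 - k * h) * p
    return x

rng = np.random.default_rng(3)
w_true = 0.7                             # unknown payoff weighting to recover
z = 2.0 * w_true + rng.normal(0.0, 0.2, size=200)
w_hat = estimate_weight(z)
```

As `w_hat` converges to the true weighting, the "currently optimal" evasive controls recomputed from it approach the actually optimal ones, which is the mechanism the abstract describes.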
In practical combat scenarios, cooperative intercept strategies are often carefully designed, and it is challenging for hypersonic vehicles to achieve successful evasion. Analysis shows that if several Successive Pursuers come from the Same Direction (SPSD) and fly with proper spacing, the evasion difficulty may increase greatly. To address this problem, we focus on evasion guidance strategy design for Air-breathing Hypersonic Vehicles (AHVs) under the SPSD combat scenario. To avoid adverse influence on the scramjet, altitude, and speed of the vehicle, lateral maneuvering and evasion are employed. To guarantee the remaining maneuver capability, the concept of a specified miss distance is introduced and utilized to generate the guidance command for the AHV. In the framework of constrained optimal control, the analytical expression of the evasion command is derived, and the overload constraints are ensured never to be violated. In fact, by analyzing the spacing of the pursuers, it can be determined whether a cooperative pursuit has formed. For coordination-unformed multiple pursuers, evasion can be achieved easily by the proposed strategy. If coordination has formed, the proposed method generates a large reverse-direction maneuver, and successful evasion can still be achieved. The performance of the proposed algorithms is tested in numerical simulations.
In this paper, we study a pursuit-evasion differential game problem in the Hilbert space L2. The dynamics of a countable number of pursuers and an evader are expressed as nth-order differential equations with geometric constraints on the control functions of the players. The game terminates at a given fixed time denoted by θ. The game's payoff is the infimum of the distances between the evader and the pursuers at time θ. According to the rules of the game, the pursuers try to minimize the distance to the evader, and the evader tries to maximize it. We find the value of the game and construct the players' optimal strategies.
The typical BDI (belief-desire-intention) model of an agent is not efficiently computable, and its strict logic expressions are not easily applicable to the AUV (autonomous underwater vehicle) domain with uncertainties. In this paper, an AUV fuzzy neural BDI model is proposed. The model is a fuzzy neural network composed of five layers: input (beliefs and desires), fuzzification, commitment, fuzzy intention, and defuzzification. In the model, fuzzy commitment rules and a neural network are combined to form intentions from beliefs and desires. The model is demonstrated by solving a pursuit-evasion game (PEG), and the simulation result is satisfactory.
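The fuzzification layer of such a model can be sketched with triangular membership functions; the linguistic terms and breakpoints below are assumptions for illustration, not the paper's design:

```python
def tri_mf(x, a, b, c):
    """Triangular membership function rising from a to b and falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Illustrative fuzzification of a "distance to evader" belief (units arbitrary).
def fuzzify_distance(d):
    return {
        "near": tri_mf(d, -1.0, 0.0, 5.0),
        "medium": tri_mf(d, 2.0, 6.0, 10.0),
        "far": tri_mf(d, 8.0, 12.0, 100.0),
    }

mu = fuzzify_distance(4.0)   # a crisp belief becomes degrees of membership
```

Downstream layers would combine such membership degrees through the fuzzy commitment rules to form a fuzzy intention, which the defuzzification layer turns back into a crisp AUV command.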
In this paper we describe a new reinforcement learning approach based on different states. When the multi-agent system is in a coordination state, we take all coordinating agents as players and choose a learning approach based on game theory. When the multi-agent system is in an independent state, we make each agent use independent learning. We demonstrate that the proposed method, applied to the pursuit-evasion problem, can cope with the dimensionality problem caused by the state and action spaces scaling exponentially with the number of agents, without convergence problems, and we compare it with other related multi-agent learning methods. Simulation experiment results show the feasibility of the algorithm.
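The independent-learning branch can be illustrated with tabular Q-learning on a toy one-dimensional pursuit; the ring world, rewards, and hyperparameters are assumptions for illustration only:

```python
import numpy as np

def train_pursuer(episodes=3000, n=10, seed=0):
    """Independent tabular Q-learning for one pursuer on a ring of n cells.
    The evader always advances 1 cell per step; the pursuer picks an
    advance of 0, 1 or 2 cells.  State = pursuer-evader gap (mod n)."""
    rng = np.random.default_rng(seed)
    q = np.zeros((n, 3))
    alpha, gamma, eps = 0.2, 0.95, 0.1
    for _ in range(episodes):
        gap = int(rng.integers(1, n))                 # random non-captured start
        for _ in range(40):
            # epsilon-greedy action selection
            a = int(rng.integers(3)) if rng.random() < eps else int(q[gap].argmax())
            new_gap = (gap + 1 - a) % n               # evader +1, pursuer +a
            r = 10.0 if new_gap == 0 else -1.0        # capture bonus, step cost
            q[gap, a] += alpha * (r + gamma * q[new_gap].max() - q[gap, a])
            gap = new_gap
            if gap == 0:
                break
    return q

q = train_pursuer()
```

The learned greedy policy advances 2 cells whenever the gap is positive, i.e. the pursuer uses its speed advantage to close on the evader; the paper's contribution is switching between this kind of independent learner and a game-theoretic learner depending on the coordination state.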
Miss distance is an important parameter for assessing highly maneuvering target interception. Due to the noise-corrupted measurements and the fact that not all state variables can be directly measured, the miss distance becomes a random variable with an a priori unknown distribution. Currently, such a distribution is mainly evaluated by Monte Carlo simulation. In this paper, an analytic approach is derived for a discrete-time controlled system with noise-corrupted state information. The system is subject to a bang-bang control strategy. The analytic distribution is validated through comparison with Monte Carlo simulation.
This paper presents a novel evasion guidance law for hypersonic morphing vehicles, focusing on determining the optimized unfolded angle of the wings to improve maneuverability based on an intelligent algorithm. First, the pursuit-evasion problem is modeled as a Markov decision process. The agent's action consists of the maneuver overload and the unfolded angle of the wings, which differs from conventional evasion guidance designed for fixed-shape vehicles. The reward function is formulated to ensure that the miss distances satisfy the prescribed bounds while minimizing energy consumption. Then, to maximize the expected cumulative reward, a residual learning method is proposed based on proximal policy optimization, which integrates the optimal evasion for linear cases as the baseline and trains to optimize performance in nonlinear engagements with multiple pursuers. Therefore, offline training guarantees improvement of the constructed evasion guidance law over conventional ones. Ultimately, the guidance law for online implementation involves only analytical calculations; it maps from the confrontation state to the expected angle of attack and the unfolded angle while retaining high computational efficiency. Simulations show that the proposed evasion guidance law can utilize the change of unfolded angle to extend the maximum overload capability, and it surpasses conventional maneuver strategies by ensuring better evasion efficacy and higher energy efficiency.
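The kind of reward shaping the abstract describes (a miss-distance bound plus an energy penalty) might look like the following sketch; the functional form and constants are assumptions, not the paper's values:

```python
def evasion_reward(miss_distance, u_sq_integral, d_min=50.0, w_energy=1e-3):
    """Illustrative terminal reward for the evasion MDP: a bonus when the
    miss distance meets the prescribed bound d_min, a penalty scaled by
    how badly the bound is violated, and a running energy penalty on the
    integral of the squared control effort."""
    if miss_distance >= d_min:
        bonus = 100.0
    else:
        bonus = -100.0 * (1.0 - miss_distance / d_min)
    return bonus - w_energy * u_sq_integral

r_ok = evasion_reward(80.0, 2.0e4)     # bound met, moderate energy use
r_fail = evasion_reward(10.0, 2.0e4)   # bound badly violated
```

Under such shaping the PPO agent is pushed to clear the miss-distance bound first and only then to economize on control effort, matching the priorities stated in the abstract.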
There are three important roles in an evasion conflict: pursuer, target, and defender. The pursuers' mission is to reach the targets; the targets' mission is to escape from the pursuers' capture; the defenders' mission is to intercept pursuers who are potentially dangerous to the targets. In this paper, a distributed online mission planning (DOMP) algorithm for pursuers is proposed based on fuzzy evaluation and Nash equilibrium. First, an integrated effectiveness evaluation model is given. Then, the details of collaborative mission planning, which includes the co-optimization of task distribution, trajectory, and the corresponding maneuvering scheme, are presented. Finally, the convergence and steadiness of DOMP are discussed with simulation results. Compared with centralized mission planning, DOMP is more robust and can greatly improve the effectiveness of pursuit. It can be applied to dynamic scenarios thanks to its distributed architecture.
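A Nash equilibrium computation for a small discrete task-assignment game, of the kind a DOMP-style planner might solve, can be sketched by brute force (illustrative only; the paper's game and payoffs differ):

```python
import numpy as np

def pure_nash(A, B):
    """All pure-strategy Nash equilibria of a bimatrix game: A[i, j] is
    the row player's payoff, B[i, j] the column player's.  A profile is
    an equilibrium when neither player gains by deviating unilaterally."""
    eqs = []
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            if A[i, j] >= A[:, j].max() and B[i, j] >= B[i, :].max():
                eqs.append((i, j))
    return eqs

# Toy 2x2 coordination game: both pursuers prefer matching task choices.
A = np.array([[3, 0], [0, 2]])
B = np.array([[3, 0], [0, 2]])
eqs = pure_nash(A, B)
```

Brute force is fine at this scale; a distributed planner like DOMP instead reaches an equilibrium through iterated local best responses, which is what makes it robust to the loss of a central node.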
Pursuit-evasion games involving mobile robots provide an excellent platform for analyzing the performance of pursuit and evasion strategies. Pursuit-evasion has received considerable attention from researchers in the past few decades due to its application to a broad spectrum of problems that arise in various domains such as defense research, robotics, computer games, drug delivery, and cell biology. Several methods have been introduced in the literature to compute the winning chances of a single pursuer or single evader in a two-player game. Over the past few decades, proportional navigation guidance (PNG) based methods have proved quite effective for pursuit, especially for missile navigation and target tracking. However, a performance comparison of these pursuer-centric strategies against recent evader-centric schemes has not been found in the literature for wheeled mobile robot applications. With a view to understanding the performance of each evasion strategy against various pursuit strategies and vice versa, four different proportional navigation-based pursuit schemes have been evaluated against five evader-centric schemes, and vice versa, for non-holonomic wheeled mobile robots. The pursuer's strategies include three well-known schemes, namely augmented ideal proportional navigation guidance (AIPNG), modified AIPNG, and angular acceleration guidance (AAG), along with a recently introduced pursuer-centric scheme called anticipated trajectory-based proportional navigation guidance (ATPNG). The evader-centric schemes are classic evasion, random motion, optical-flow based evasion, Apollonius circle based evasion, and a recently introduced strategy called anticipated velocity based evasion. The performance of each pursuit method was evaluated against the five evasion methods through hardware implementation and analyzed in terms of time of interception and the distance traveled by the players. The working environment was obstacle-free, and the maximum velocity of the pursuer was taken to be greater than that of the evader so that the game concludes in finite time. It is concluded that ATPNG performs better than the other PNG-based schemes, and the anticipated velocity based evasion scheme performs better than the other evasion schemes.
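The core PNG idea shared by the compared schemes, turning the pursuer at a multiple of the line-of-sight rate, can be sketched kinematically; this omits the AIPNG/ATPNG refinements and uses illustrative numbers throughout:

```python
import numpy as np

def png_intercept(n_gain=3.0, dt=0.01, t_max=30.0):
    """Planar proportional navigation for a unicycle-like pursuer: the
    heading rate is n_gain times the line-of-sight (LOS) rate.  The
    evader here flies a straight line and is slower than the pursuer.
    Returns the interception time, or None on failure."""
    p = np.array([0.0, 0.0])                       # pursuer position
    hp, vp = 0.0, 2.0                              # pursuer heading and speed
    e = np.array([10.0, 5.0])                      # evader position
    ve = np.array([0.0, -1.0])                     # evader velocity
    lam_prev = np.arctan2(e[1] - p[1], e[0] - p[0])
    t = 0.0
    while t < t_max:
        if np.linalg.norm(e - p) < 0.2:            # capture radius
            return t
        lam = np.arctan2(e[1] - p[1], e[0] - p[0])
        # LOS rate, with wrap-around handled via the angle difference
        lam_dot = np.arctan2(np.sin(lam - lam_prev), np.cos(lam - lam_prev)) / dt
        hp += n_gain * lam_dot * dt                # PN steering law
        p = p + vp * dt * np.array([np.cos(hp), np.sin(hp)])
        e = e + ve * dt
        lam_prev = lam
        t += dt
    return None

t_hit = png_intercept()
```

Because PN nulls the LOS rate, the pursuer settles onto a collision course that leads the target rather than chasing its current position, which is why the PNG family dominates pure pursuit in the comparisons above.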
A defender-attacker-target problem with a non-moving target is considered. This problem is modelled by a pursuit-evasion zero-sum differential game with linear dynamics and a quadratic cost functional. In this game, the pursuer is the defender, while the evader is the attacker. The objective of the pursuer is to minimise the cost functional, while the evader has two objectives: to maximise the cost functional and to satisfy a given terminal-state inequality constraint. The open-loop saddle point solution of this game is obtained in the case where the transfer functions of the controllers for the defender and the attacker are of arbitrary orders.
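For the feedback (state-dependent) version of such a linear-quadratic zero-sum game, the saddle point follows from a game Riccati equation; the sketch below integrates it numerically for a double integrator, and omits the paper's open-loop derivation and terminal inequality constraint:

```python
import numpy as np

def game_riccati(A, B, C, Q, R1, R2, F, T, steps=2000):
    """Backward Euler integration of the Riccati equation for a zero-sum
    LQ pursuit-evasion game
        -dP/dt = A'P + PA + Q - P (B R1^-1 B' - C R2^-1 C') P,  P(T) = F.
    The saddle-point feedback controls are u = -R1^-1 B' P x (defender)
    and v = R2^-1 C' P x (attacker)."""
    dt = T / steps
    S = B @ np.linalg.inv(R1) @ B.T - C @ np.linalg.inv(R2) @ C.T
    P = F.copy()
    for _ in range(steps):
        dP = A.T @ P + P @ A + Q - P @ S @ P
        P = P + dP * dt          # stepping backward from the terminal time
    return P

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double-integrator relative dynamics
B = np.array([[0.0], [1.0]])             # defender input channel
C = np.array([[0.0], [0.5]])             # attacker input channel
P0 = game_riccati(A, B, C, Q=np.eye(2), R1=np.eye(1), R2=4 * np.eye(1),
                  F=np.eye(2), T=2.0)
```

The solution stays bounded here because the defender's control authority outweighs the attacker's (B R1^-1 B' dominates C R2^-1 C'); when it does not, the Riccati solution can escape in finite time, reflecting an attacker advantage.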
Funding: Supported in part by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA27030100) and the National Natural Science Foundation of China (72293575, 11832001).
Funding: Supported by the National Defense Science and Technology Innovation Program (18-163-15-LZ-001-004-13).
Funding: Supported by the National Defense Science and Technology Innovation Program (18-163-15-LZ-001-004-13).
Abstract: Miss distance is a critical parameter for assessing the performance of highly maneuvering target interception (HMTI). In a realistic terminal guidance system, the pursuer's control depends on an estimate of the unknown state, so the miss distance becomes a random variable with an a priori unknown distribution. Currently, such a distribution is mainly evaluated by Monte Carlo simulation. In this paper, by integrating the estimation error model of the zero-effort miss distance (ZEM) obtained in our previous work, an analytic method for solving the distribution of the miss distance is proposed, in which the system is presumed to use a bang-bang control strategy. Comparison with Monte Carlo simulations under four different types of disturbances (maneuvers) validates the correctness of the proposed method. The results of this paper provide a powerful tool for the design, analysis, and performance evaluation of guidance systems.
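The Monte Carlo baseline that the analytic method replaces can be sketched for a scalar ZEM model, with bang-bang guidance acting on a noisy state estimate; the dynamics, noise level, and maneuver model below are illustrative assumptions, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_miss(u_max=3.0, w_max=1.0, sigma=0.5, T=5.0, dt=0.01, n_runs=2000):
    """Monte Carlo miss-distance samples for a scalar zero-effort-miss (ZEM)
    model: bang-bang guidance acts on a noisy ZEM estimate while the target
    applies a random maneuver (toy model, not the paper's dynamics)."""
    steps = int(T / dt)
    z = rng.uniform(-5, 5, n_runs)                         # initial ZEM
    for _ in range(steps):
        z_est = z + sigma * rng.standard_normal(n_runs)    # noisy estimate
        u = -u_max * np.sign(z_est)                        # bang-bang command
        w = w_max * np.sign(rng.standard_normal(n_runs))   # target maneuver
        z = z + (u + w) * dt
    return np.abs(z)                                       # |ZEM| at intercept

miss = simulate_miss()
```

The empirical histogram of `miss` is exactly the distribution the paper derives analytically; the Monte Carlo route needs thousands of runs per parameter setting, which motivates the analytic alternative.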
基金supported,in part,by the National Natural Science Foundation of China(Nos.12272116 and 62088101)the Zhejiang Provincial Natural Science Foundation of China(Nos.LY22A020007 and LR20F030003)+1 种基金the Fundamental Research Funds for the Provincial Universities of Zhejiang,China(Nos.GK239909299001-014)the National Key Basic Research Strengthen Foundation of China(Nos.2021JCJQ-JJ-1183 and 2020-JCJQ-JJ-176)。
Abstract: This work is inspired by a stealth pursuit behavior called motion camouflage, whereby a pursuer approaches an evader while camouflaging itself against a predetermined background. We formulate the spacecraft pursuit-evasion problem as a stealth pursuit strategy of motion camouflage, in which the pursuer tries to minimize a motion camouflage index defined in this paper. The Euler-Hill reference frame, whose origin is set on the circular reference orbit, is used to describe the dynamics. Based on the rule of motion camouflage, an open-loop guidance strategy that achieves the motion camouflage index is derived, in which the pursuer lies on the camouflage constraint line connecting the central spacecraft and the evader. To remove the dependence on the evader's acceleration in the open-loop guidance strategy, we further consider the motion camouflage pursuit problem within an infinite-horizon nonlinear quadratic differential game. The saddle point solution to the game is derived using the state-dependent Riccati equation method, and the resulting closed-loop guidance strategy is effective in achieving motion camouflage. Simulations demonstrate the capabilities of the proposed guidance strategies in the pursuit-evasion game scenario.
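The open-loop camouflage constraint, keeping the pursuer on the line joining the central spacecraft and the evader, can be sketched kinematically. The evader trajectory and the ramp on the line parameter below are hypothetical, purely to illustrate the geometry:

```python
import numpy as np

def camouflage_pursuer(evader_traj, c=np.zeros(3), lam0=0.2):
    """Construct a pursuer trajectory satisfying the motion-camouflage
    constraint p(t) = c + lam(t) * (e(t) - c): the pursuer always lies on the
    line through the reference point c and the evader. Here lam ramps
    linearly from lam0 to 1 (i.e., capture); a real guidance law would set
    lam(t) dynamically."""
    n = len(evader_traj)
    lam = np.linspace(lam0, 1.0, n)
    return c + lam[:, None] * (evader_traj - c)

# Hypothetical evader trajectory on a slow arc (reference point at the origin)
t = np.linspace(0, 1, 50)
e = np.stack([10 + 2 * t, 5 * np.cos(t), 5 * np.sin(t)], axis=1)
p = camouflage_pursuer(e)

# Collinearity check: the cross product of (p - c) and (e - c) must vanish
residual = np.linalg.norm(np.cross(p, e), axis=1).max()
```

From the background point c, the pursuer's bearing is always identical to the evader's, so the pursuer appears stationary against the background while closing in.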
Funding: This study was supported by the Independent Innovation Science Foundation Project of National University of Defense Technology, China (No. 22-ZZCX-083).
Abstract: The qualitative spacecraft pursuit-evasion problem, which focuses on feasibility, is rarely studied because of high-dimensional dynamics, intractable terminal constraints, and heavy computational cost. In this paper, a physics-informed framework is proposed for the problem, providing an intuitive method for spacecraft threat relationship determination, situation assessment, mission feasibility analysis, and orbital game rule summarization. For the first time, situation adjustment suggestions can be provided for the weak player in the orbital game. First, a dimension-reduced dynamics is derived in the line-of-sight rotation coordinate system and the qualitative model is determined, reducing complexity and avoiding the difficulty of target set presentation caused by individual modeling. Second, the Backwards Reachable Set (BRS) of the target set is used for state space partition and capture zone presentation. Reverse-time analysis eliminates the influence of a changeable initial state and enables the proposed framework to analyze multiple situations simultaneously. Third, a time-dependent Hamilton-Jacobi-Isaacs (HJI) Partial Differential Equation (PDE) is established, based on the level set method, to describe the BRS evolution driven by the dimension-reduced dynamics. Physics-Informed Neural Networks (PINNs) are then extended to the HJI PDE final value problem, supporting orbital game rule summarization through capture zone evolution analysis. Finally, numerical results demonstrate the feasibility and efficiency of the proposed framework.
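The level-set evolution of a BRS can be illustrated in one dimension for the relative dynamics x_dot = v - u with a stronger pursuer. This grid scheme is only a sketch of the underlying HJI idea, not the paper's PINN-based solver; all parameters are assumed:

```python
import numpy as np

def brs_1d(u_max=1.0, v_max=0.5, r=1.0, T=2.0, dx=0.02, dt=0.005):
    """Backward reachable set for relative dynamics x_dot = v - u
    (pursuer |u| <= u_max, evader |v| <= v_max) and target set {|x| <= r},
    via an upwind level-set update of the 1-D HJI equation. With
    u_max > v_max the capture front expands at speed u_max - v_max."""
    x = np.arange(-6, 6 + dx, dx)
    V = np.abs(x) - r                       # signed distance to target set
    c = u_max - v_max                       # front speed in backward time
    for _ in range(int(T / dt)):
        dm = np.diff(V, prepend=V[0]) / dx  # backward difference
        dp = np.diff(V, append=V[-1]) / dx  # forward difference
        # Godunov upwind gradient magnitude for an expanding sublevel set
        grad = np.sqrt(np.maximum(dm, 0.0) ** 2 + np.minimum(dp, 0.0) ** 2)
        V = V - dt * c * grad               # grow the capture zone
    return x[V <= 0].max()                  # capture radius after horizon T

R = brs_1d()
```

The zero level set of V marks the capture-zone boundary; here it should expand from r = 1 to roughly r + (u_max - v_max) T = 2 over the horizon, matching the intuition that the pursuer's speed advantage accumulates linearly.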
Abstract: In this paper, the pursuit-evasion game with state and control constraints is solved to achieve the Nash equilibrium of both the pursuer and the evader with an iterative self-play technique. Under the condition that the Hamiltonian formed by means of Pontryagin's maximum principle has a unique solution, it can be proven that the iterative control law converges to the Nash equilibrium solution. However, the strong nonlinearity of the ordinary differential equations formulated by Pontryagin's maximum principle makes the control policy difficult to figure out, and the system dynamics employed in this manuscript contain a high-dimensional state vector with constraints. In practical applications, such as aircraft control, the available overload is limited. Therefore, in this paper, we consider the optimal strategy of pursuit-evasion games with a constant constraint on the control, while some state variables are restricted by a function of the input. To address these challenges, the optimal control problems are transformed into nonlinear programming problems through the direct collocation method. Finally, two numerical cases of the aircraft pursuit-evasion scenario demonstrate the effectiveness of the presented method in obtaining the optimal controls of both the pursuer and the evader.
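Direct collocation can be sketched on a one-dimensional minimum-effort problem: states and controls become NLP variables, and trapezoidal "defect" constraints enforce the dynamics. This stand-in uses scipy's SLSQP solver and a double-integrator model, both assumptions rather than the paper's setup:

```python
import numpy as np
from scipy.optimize import minimize

def solve_min_effort(x0=0.0, v0=0.0, xf=1.0, T=2.0, N=20, u_max=2.0):
    """Transcribe a 1-D minimum-effort transfer (x_dot = v, v_dot = u,
    |u| <= u_max) into an NLP by trapezoidal direct collocation."""
    dt = T / N
    n = N + 1                                  # number of grid nodes

    def unpack(z):
        return z[:n], z[n:2 * n], z[2 * n:]    # x, v, u

    def cost(z):                               # trapezoidal quadrature of u^2
        _, _, u = unpack(z)
        return dt * np.sum((u[:-1] ** 2 + u[1:] ** 2) / 2)

    def defects(z):                            # dynamics + boundary conditions
        x, v, u = unpack(z)
        dx = x[1:] - x[:-1] - dt * (v[:-1] + v[1:]) / 2
        dv = v[1:] - v[:-1] - dt * (u[:-1] + u[1:]) / 2
        bc = [x[0] - x0, v[0] - v0, x[-1] - xf, v[-1]]
        return np.concatenate([dx, dv, bc])

    z0 = np.zeros(3 * n)
    bounds = [(None, None)] * (2 * n) + [(-u_max, u_max)] * n  # control bound
    return minimize(cost, z0, method="SLSQP", bounds=bounds,
                    constraints={"type": "eq", "fun": defects})

res = solve_min_effort()
```

The same transcription pattern extends to the aircraft game: each player's trajectory becomes a decision vector, overload limits become box bounds, and the state-dependent restrictions become path constraints at the collocation nodes.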
Funding: Project supported by the National Natural Science Foundation of China (Nos. U1909206, T2121002, 61903007, and 11972373).
Abstract: For complex functions to emerge in artificial systems, it is important to understand the intrinsic mechanisms of biological swarm behaviors in nature. In this paper, we present a comprehensive survey of pursuit-evasion, which is a critical problem in biological groups. First, we review the problem of pursuit-evasion from three different perspectives: game theory, control theory and artificial intelligence, and bio-inspired perspectives. Then we provide an overview of the research on pursuit-evasion problems in biological systems and artificial systems. We summarize predator pursuit behavior and prey evasion behavior as predator-prey behavior. Next, we analyze the application of pursuit-evasion in artificial systems from three perspectives, i.e., a strong pursuer group vs. a weak evader group, a weak pursuer group vs. a strong evader group, and equal-ability groups. Finally, relevant prospects for future pursuit-evasion challenges are discussed. This survey provides new insights into the design of multi-agent and multi-robot systems to complete complex hunting tasks in uncertain dynamic scenarios.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11572345 and 11972044) and the Program of National University of Defense Technology (Grant No. ZK18-03-07).
Abstract: The orbital pursuit-evasion game is typically formulated as a complete-information game, which assumes that the payoff functions of the two players are common knowledge. However, realistic pursuit-evasion games typically have incomplete information, in which the lack of payoff information limits a player's ability to play optimally. To address this problem, this paper proposes a currently optimal escape strategy based on estimation for the evader. In this strategy, the currently optimal evasive controls are first derived based on the evader's guess of the pursuer's payoff weightings. Then an online parameter estimation method based on a modified strong tracking unscented Kalman filter is employed to correct the guess and update the strategy during the game. As the estimate becomes accurate, the currently optimal strategy approaches the actually optimal strategy. Simulation results show that the proposed strategy achieves optimal evasive controls progressively and that the evader's payoff under this strategy is lower than that of the zero-sum escape strategy. The proposed strategy is also effective in the case where the pursuer changes its payoff function halfway through the game.
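The online correction of a guessed payoff weighting can be illustrated with a far simpler estimator than the modified strong-tracking UKF: a scalar recursive least squares fit of an opponent's feedback gain from noisy observations of its control. The model, gains, and noise levels below are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def estimate_weight(k_true=0.8, sigma=0.05, steps=200):
    """Online estimation of an opponent's (scalar) payoff weighting from
    noisy observations of its control u = -k * x, via recursive least
    squares; a toy stand-in for the paper's strong-tracking UKF."""
    k_hat, P = 0.0, 10.0          # initial guess and its variance
    x = 1.0                       # relative state (toy)
    for _ in range(steps):
        u = -k_true * x + sigma * rng.standard_normal()  # observed control
        H = -x                                           # measurement model
        K = P * H / (H * P * H + sigma ** 2)             # RLS gain
        k_hat += K * (u - H * k_hat)                     # correct the guess
        P = (1 - K * H) * P                              # shrink uncertainty
        x = x + 0.1 * (u + 0.2 * rng.standard_normal())  # crude state update
    return k_hat

k_hat = estimate_weight()
```

As in the paper's scheme, every observed opponent control tightens the estimate, so the "currently optimal" response computed from `k_hat` converges toward the response against the true weighting.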
Funding: Supported by the Aeronautical Science Foundation of China (Nos. 20160153002 and 20180753007), the National Natural Science Foundation of China (No. 61933010), and the Natural Science Basic Research Plan in Shaanxi Province, China (No. 2019JZ-08).
Abstract: In practical combat scenarios, cooperative intercept strategies are often carefully designed, and it is challenging for hypersonic vehicles to achieve successful evasion. Analysis shows that if several Successive Pursuers come from the Same Direction (SPSD) and fly with a proper spacing, the evasion difficulty may increase greatly. To address this problem, we focus on evasion guidance strategy design for Air-breathing Hypersonic Vehicles (AHVs) under the SPSD combat scenario. To avoid adverse effects on the scramjet, altitude, and speed of the vehicle, lateral maneuvering and evasion are employed. To guarantee the remaining maneuvering capability, the concept of a specified miss distance is introduced and used to generate the guidance command for the AHV. In the framework of constrained optimal control, the analytical expression of the evasion command is derived, and the overload constraints are guaranteed never to be violated. By analyzing the spacing of the pursuers, it can be determined whether a cooperative pursuit has formed. For multiple pursuers without coordination, evasion can be achieved easily by the proposed strategy. If coordination is formed, the proposed method generates a large reverse-direction maneuver, and successful evasion can still be maintained. The performance of the proposed algorithms is tested in numerical simulations.
Abstract: In this paper, we study a pursuit-evasion differential game problem in the Hilbert space L2. The dynamics of a countable number of pursuers and an evader are expressed as nth-order differential equations with geometric constraints on the control functions of the players. The game terminates at a given fixed time, denoted by θ. The game's payoff is the infimum of the distances between the evader and the pursuers at the time θ. According to the rules of the game, the pursuers try to minimize this distance and the evader tries to maximize it. We find the value of the game and construct the players' optimal strategies.
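The payoff, the infimum of evader-pursuer distances at the terminal time θ, reduces in practice to a minimum over any finite truncation of the countable pursuer family. A minimal sketch with hypothetical terminal states (only the first few coordinates of the L2 elements shown):

```python
import numpy as np

def game_payoff(evader, pursuers):
    """Payoff at the terminal time theta: the infimum (here, a minimum over
    a finite truncation of the countable pursuer family) of the distances
    between the evader and the pursuers, in the Euclidean/L2 norm."""
    return min(np.linalg.norm(evader - p) for p in pursuers)

# Hypothetical terminal states at time theta
theta_evader = np.array([1.0, 2.0, 0.0])
theta_pursuers = [np.array([1.0, 2.0, 3.0]),
                  np.array([0.0, 0.0, 0.0]),
                  np.array([1.0, 1.0, 0.0])]

payoff = game_payoff(theta_evader, theta_pursuers)
```

The pursuers jointly drive this minimum down while the evader drives it up; the value of the game is the guaranteed level of this quantity under optimal play.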
Abstract: The typical BDI (belief-desire-intention) model of an agent is not efficiently computable, and its strict logic expressions are not easily applicable to the AUV (autonomous underwater vehicle) domain with uncertainties. In this paper, an AUV fuzzy neural BDI model is proposed. The model is a fuzzy neural network composed of five layers: input (beliefs and desires), fuzzification, commitment, fuzzy intention, and defuzzification. In the model, fuzzy commitment rules and the neural network are combined to form intentions from beliefs and desires. The model is demonstrated by solving a PEG (pursuit-evasion game), and the simulation result is satisfactory.
Abstract: In this paper we describe a new reinforcement learning approach based on different states. When the multi-agent system is in a coordination state, we take all coordinating agents as players and choose a learning approach based on game theory. When the multi-agent system is in an independent state, we let each agent use independent learning. We demonstrate that the proposed method, applied to the pursuit-evasion problem, can overcome the dimensionality problem, in which the sizes of both the state and the action spaces scale exponentially with the number of agents, without convergence problems, and we compare it with other related multi-agent learning methods. Simulation results show the feasibility of the algorithm.
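The independent-learning branch can be sketched with tabular Q-learning on a toy one-dimensional pursuit grid; the coordination branch would replace this per-agent update with a game-theoretic joint one. The grid size, rewards, and learning rates below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

def train_independent_q(size=7, episodes=300, alpha=0.5, gamma=0.9, eps=0.1):
    """Independent Q-learning for a single pursuer chasing a fixed evader on
    a 1-D grid (toy version of the 'independent state' branch)."""
    evader = size - 1
    Q = np.zeros((size, 2))             # states x actions {0: left, 1: right}
    for _ in range(episodes):
        s = 0
        for _ in range(4 * size):
            # epsilon-greedy action selection
            a = rng.integers(2) if rng.random() < eps else int(np.argmax(Q[s]))
            s2 = max(0, min(size - 1, s + (1 if a == 1 else -1)))
            r = 1.0 if s2 == evader else -0.01      # reward only on capture
            Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
            s = s2
            if s == evader:
                break
    return Q

Q = train_independent_q()
greedy = [int(np.argmax(Q[s])) for s in range(6)]   # learned policy per state
```

Because each agent maintains only its own Q-table, the table size stays linear in the number of agents, which is the dimensionality saving the abstract refers to for the independent state.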
Abstract: Miss distance is an important parameter in assessing the interception of highly maneuvering targets. Due to noise-corrupted measurements and the fact that not all state variables can be directly measured, the miss distance becomes a random variable with an a priori unknown distribution. Currently, such a distribution is mainly evaluated by Monte Carlo simulation. In this paper, an analytic approach is derived for a discrete-time controlled system with noise-corrupted state information, where the system is subject to a bang-bang control strategy. The analytic distribution is validated through comparison with Monte Carlo simulation.
Funding: This work was supported by the National Natural Science Foundation of China (No. 52202438).
Abstract: This paper presents a novel evasion guidance law for hypersonic morphing vehicles, focusing on determining the optimal wing unfolded angle to promote maneuverability based on an intelligent algorithm. First, the pursuit-evasion problem is modeled as a Markov decision process. The agent's action consists of the maneuver overload and the unfolded angle of the wings, which differs from conventional evasion guidance designed for fixed-shape vehicles. The reward function is formulated to ensure that the miss distances satisfy the prescribed bounds while minimizing energy consumption. Then, to maximize the expected cumulative reward, a residual learning method is proposed based on proximal policy optimization, which integrates the optimal evasion for linear cases as the baseline and trains to optimize the performance for nonlinear engagement with multiple pursuers. Offline training therefore guarantees improvement of the constructed evasion guidance law over conventional ones. Ultimately, the guidance law for online implementation involves only analytical calculations: it maps the confrontation state to the expected angle of attack and the unfolded angle while retaining high computational efficiency. Simulations show that the proposed evasion guidance law can exploit changes in the unfolded angle to extend the maximum overload capability, and it surpasses conventional maneuver strategies by ensuring better evasion efficacy and higher energy efficiency.
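A reward of the kind described, a miss-distance bound plus an energy penalty, might look like the following sketch; the bound, bonus magnitude, and weight are hypothetical values, not the paper's tuning:

```python
import numpy as np

def evasion_reward(miss_distance, controls, d_min=50.0, w_energy=1e-3):
    """Reward sketch for the evasion MDP: a terminal bonus when the miss
    distance stays outside the prescribed bound d_min, minus an accumulated
    energy penalty on the maneuver commands (all weights hypothetical)."""
    bonus = 10.0 if miss_distance >= d_min else -10.0
    energy = w_energy * float(np.sum(np.square(controls)))
    return bonus - energy

r_good = evasion_reward(80.0, np.ones(100) * 2.0)  # evaded, moderate energy
r_bad = evasion_reward(10.0, np.zeros(100))        # captured
```

The two-term structure makes the trade-off explicit: satisfying the miss-distance bound dominates, and among evading policies the energy term favors the cheaper maneuvers.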
Funding: Co-supported by the Heilongjiang Postdoctoral Scientific Research Developmental Fund (No. LBH-Q14054) and the Fundamental Research Funds for the Central Universities of China (No. HEUCFD1503).
Abstract: There are three important roles in an evasion conflict: pursuer, target, and defender. The pursuers' mission is to reach the targets; the targets' mission is to escape capture by the pursuers; the defenders' mission is to intercept pursuers that are potentially dangerous to the targets. In this paper, a distributed online mission plan (DOMP) algorithm for pursuers is proposed based on fuzzy evaluation and Nash equilibrium. First, an integrated effectiveness evaluation model is given. Then, the details of collaborative mission planning, which includes the co-optimization of task distribution, trajectory, and the corresponding maneuvering scheme, are presented. Finally, the convergence and steadiness of DOMP are discussed with simulation results. Compared with centralized mission planning, DOMP is more robust and can greatly improve the effectiveness of pursuit. It can be applied to dynamic scenarios owing to its distributed architecture.
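The Nash-equilibrium component can be illustrated by a unilateral-deviation check on a small bimatrix game; the payoff numbers below are hypothetical effectiveness values, not the paper's evaluation model:

```python
import numpy as np

def is_pure_nash(A, B, i, j):
    """Check whether the strategy pair (i, j) is a pure Nash equilibrium of a
    bimatrix game with payoff matrices A (row player) and B (column player):
    neither player can gain by a unilateral deviation."""
    return A[i, j] >= A[:, j].max() and B[i, j] >= B[i, :].max()

# Hypothetical 2x2 effectiveness payoffs for two pursuers choosing tasks
A = np.array([[3.0, 1.0],
              [2.0, 2.5]])
B = np.array([[2.0, 1.5],
              [1.0, 3.0]])

nash_pairs = [(i, j) for i in range(2) for j in range(2)
              if is_pure_nash(A, B, i, j)]
```

In a distributed planner, each pursuer only needs its own row or column of the evaluation to run this best-response check, which is what lets the task distribution converge without a central coordinator.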
Abstract: Pursuit-evasion games involving mobile robots provide an excellent platform to analyze the performance of pursuit and evasion strategies. Pursuit-evasion has received considerable attention from researchers in the past few decades due to its application to a broad spectrum of problems in domains such as defense research, robotics, computer games, drug delivery, and cell biology. Several methods have been introduced in the literature to compute the winning chances of a single pursuer or a single evader in a two-player game. Over the past few decades, proportional navigation guidance (PNG) based methods have proved quite effective for pursuit, especially for missile navigation and target tracking. However, a performance comparison of these pursuer-centric strategies against recent evader-centric schemes has not been found in the literature for wheeled mobile robot applications. With a view to understanding the performance of each evasion strategy against various pursuit strategies and vice versa, four different proportional-navigation-based pursuit schemes have been evaluated against five evader-centric schemes, and vice versa, for non-holonomic wheeled mobile robots. The pursuer's strategies include three well-known schemes, namely augmented ideal proportional navigation guidance (AIPNG), modified AIPNG, and angular acceleration guidance (AAG), and a recently introduced pursuer-centric scheme called anticipated trajectory-based proportional navigation guidance (ATPNG). The evader-centric schemes are classic evasion, random motion, optical-flow-based evasion, Apollonius-circle-based evasion, and another recently introduced strategy called anticipated-velocity-based evasion. The performance of each pursuit method was evaluated against the five evasion methods through hardware implementation and analyzed in terms of time of interception and the distance traveled by the players.
The working environment was obstacle-free and the maximum velocity of the pursuer was taken to be greater than that of the evader to conclude the game in finite time. It is concluded that ATPNG performs better than other PNG-based schemes, and the anticipated velocity based evasion scheme performs better than the other evasion schemes.
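The Apollonius circle used by one of the evasion schemes has a simple closed form: for speed ratio k = v_e/v_p < 1, it is the locus of points the evader can reach no later than the pursuer, so the evader steers toward points inside it. A minimal sketch of the construction:

```python
import numpy as np

def apollonius_circle(pursuer, evader, k):
    """Apollonius circle for a faster pursuer: the set of points X with
    |X - evader| / |X - pursuer| = k, where k = v_evader / v_pursuer < 1.
    Points strictly inside the circle are reached by the evader first."""
    pursuer = np.asarray(pursuer, dtype=float)
    evader = np.asarray(evader, dtype=float)
    center = (evader - k ** 2 * pursuer) / (1 - k ** 2)
    radius = k * np.linalg.norm(evader - pursuer) / (1 - k ** 2)
    return center, radius

# Pursuer at the origin, evader 4 units away, evader half as fast
center, radius = apollonius_circle([0.0, 0.0], [4.0, 0.0], k=0.5)
```

Note the circle always lies on the far side of the evader from the pursuer, which is why Apollonius-circle evasion tends to flee along the pursuer-evader line unless obstacles or other pursuers intervene.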
Abstract: A defender-attacker-target problem with a non-moving target is considered. This problem is modelled as a pursuit-evasion zero-sum differential game with linear dynamics and a quadratic cost functional. In this game, the pursuer is the defender, while the evader is the attacker. The objective of the pursuer is to minimise the cost functional, while the evader has two objectives: to maximise the cost functional and to satisfy a given terminal state inequality constraint. The open-loop saddle point solution of this game is obtained in the case where the transfer functions of the controllers of the defender and the attacker are of arbitrary orders.
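For a scalar special case of the linear-quadratic zero-sum game (ignoring the terminal inequality constraint), the saddle-point gains follow from a game Riccati equation integrated backward from the terminal weight. All coefficients below are assumptions; the sketch requires b^2/ru > c^2/rv, i.e., the defender's control authority dominates:

```python
import numpy as np
from scipy.integrate import solve_ivp

def saddle_riccati(a=0.0, b=1.0, c=0.6, q=1.0, ru=1.0, rv=1.0,
                   pT=5.0, T=3.0):
    """Scalar LQ pursuit-evasion game: x_dot = a x + b u + c v, with cost
    J = 0.5 pT x(T)^2 + 0.5 int (q x^2 + ru u^2 - rv v^2) dt.
    The saddle point u* = -(b/ru) p x, v* = (c/rv) p x uses p(t) from the
    game Riccati equation, integrated backward from p(T) = pT."""
    def riccati(t, p):
        # -p_dot = 2 a p + q - p^2 (b^2/ru - c^2/rv)
        return -(2 * a * p + q - p ** 2 * (b ** 2 / ru - c ** 2 / rv))

    sol = solve_ivp(riccati, [T, 0.0], [pT])   # backward integration
    return float(sol.y[0, -1])                 # p(0)

p0 = saddle_riccati()
```

With a = 0, q = 1, and b^2/ru - c^2/rv = 0.64, the backward flow settles near the steady value sqrt(q / 0.64) = 1.25, so p(0) is close to 1.25 for this horizon; a negative coefficient b^2/ru - c^2/rv (an evader that out-muscles the defender) would instead make the Riccati solution blow up, i.e., no saddle point.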