Abstract: Combining a heuristic algorithm (HA) developed from specific knowledge of cooperative multiple target attack (CMTA) tactics with particle swarm optimization (PSO), a heuristic particle swarm optimization (HPSO) algorithm is proposed to solve the decision-making (DM) problem. The HA facilitates searching for the local optimum in the neighborhood of a solution, while the PSO algorithm tends to explore the search space for possible solutions. By combining the advantages of HA and PSO, the HPSO algorithm can find the global optimum quickly and efficiently. It obtains the DM solution by seeking the optimal assignment of the missiles of friendly fighter aircraft (FAs) to hostile FAs. Simulation results show that the proposed algorithm is superior to the general PSO algorithm and two GA-based algorithms in searching for the best solution to the DM problem.
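As an illustration of the kind of PSO loop the abstract describes, the following minimal sketch evolves missile-to-target assignments. The kill probabilities, fitness function, decoding scheme, and parameters are illustrative assumptions, not the paper's HPSO model or its heuristic component.

```python
import numpy as np

# Toy PSO: assign each of M missiles to one of T hostile fighters.
# A particle is a real vector of length M; decoding rounds/clips each
# component to a target index. Kill probabilities are made up here.
rng = np.random.default_rng(0)
M, T = 6, 3
p_kill = rng.uniform(0.3, 0.9, size=(M, T))   # hypothetical kill probabilities

def decode(x):
    return np.clip(np.rint(x), 0, T - 1).astype(int)

def fitness(x):
    a = decode(x)
    # expected number of surviving targets (to be minimized)
    survive = np.ones(T)
    for m, t in enumerate(a):
        survive[t] *= (1.0 - p_kill[m, t])
    return survive.sum()

n, iters, w, c1, c2 = 30, 200, 0.7, 1.5, 1.5
x = rng.uniform(0, T - 1, size=(n, M))
v = np.zeros_like(x)
pbest, pbest_f = x.copy(), np.array([fitness(p) for p in x])
g = pbest[pbest_f.argmin()].copy()

for _ in range(iters):
    r1, r2 = rng.random((n, M)), rng.random((n, M))
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
    x = x + v
    f = np.array([fitness(p) for p in x])
    better = f < pbest_f
    pbest[better], pbest_f[better] = x[better], f[better]
    g = pbest[pbest_f.argmin()].copy()

print("best assignment:", decode(g), "fitness:", pbest_f.min())
```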
Funding: Supported by Major Projects for Science and Technology Innovation 2030 (Grant No. 2018AA0100800), the Equipment Pre-research Foundation of Laboratory (Grant No. 61425040104), and in part by the Jiangsu Province "333" Project under Grant BRA2019051.
Abstract: Game theory can be applied to the air combat decision-making problem of multiple unmanned combat air vehicles (UCAVs). However, it is difficult to obtain satisfactory decision-making results by relying solely on air combat situation information, because a complex air combat environment also contains a great deal of time-sensitive information. In this paper, a constraint strategy game approach is developed to generate intelligent decisions for multiple UCAVs in a complex air combat environment using both air combat situation information and time-sensitive information. First, a constraint strategy game is employed to model the attack-defense decision-making problem in a complex air combat environment. Then, an algorithm is proposed for solving the constraint strategy game based on linear programming and linear inequalities (CSG-LL). Finally, an example is given to illustrate the effectiveness of the proposed approach.
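The linear-programming building block that game-theoretic solvers of this kind rest on can be sketched as follows: solving a small zero-sum matrix game for the row player's optimal mixed strategy. The payoff matrix is invented and this is not the paper's full constraint strategy game or CSG-LL algorithm.

```python
import numpy as np
from scipy.optimize import linprog

# Payoff matrix for the row player (attacker) in a toy zero-sum game.
A = np.array([[3.0, 1.0, 4.0],
              [2.0, 5.0, 0.0]])
m, n = A.shape

# Variables: mixed strategy x (length m) and game value v.
# Maximize v  <=>  minimize -v, subject to A^T x >= v*1, sum(x) = 1, x >= 0.
c = np.concatenate([np.zeros(m), [-1.0]])
A_ub = np.hstack([-A.T, np.ones((n, 1))])      # v - A^T x <= 0, one row per column strategy
b_ub = np.zeros(n)
A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
b_eq = np.array([1.0])
bounds = [(0, None)] * m + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x, v = res.x[:m], res.x[m]
print("row strategy:", x.round(3), "game value:", round(v, 3))
```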
Funding: National Natural Science Foundation of China (Grant No. 62003267); Fundamental Research Funds for the Central Universities (Grant No. G2022KY0602); Technology on Electromagnetic Space Operations and Applications Laboratory (Grant No. 2022ZX0090); Key Core Technology Research Plan of Xi'an (Grant No. 21RGZN0016).
Abstract: Aiming at the problem of manoeuvring decision-making in UAV air combat, this study establishes a one-to-one air combat model, defines missile attack areas, and uses the non-deterministic-policy Soft Actor-Critic (SAC) algorithm from deep reinforcement learning to construct a decision model that realizes the manoeuvring process. The complexity of the proposed algorithm is calculated, and the stability of the closed-loop air combat decision-making system controlled by the neural network is analysed with a Lyapunov function. The study formulates the UAV air combat process as a game and proposes a Parallel Self-Play training SAC algorithm (PSP-SAC) to improve the generalisation performance of UAV control decisions. Simulation results show that the proposed algorithm realizes sample sharing and policy sharing across multiple combat environments and significantly improves the generalisation ability of the model compared with independent training.
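For context, the core of SAC is the entropy-regularised actor objective (maximise Q plus a weighted policy entropy). The sketch below shows that objective in generic form; the network shapes, temperature alpha, and squashed-Gaussian policy are standard placeholders, not the paper's PSP-SAC setup.

```python
import torch

# Generic SAC actor loss: minimise  E[ alpha*log pi(a|s) - Q(s,a) ].
obs_dim, act_dim, alpha = 8, 3, 0.2
policy = torch.nn.Sequential(
    torch.nn.Linear(obs_dim, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2 * act_dim))
q_net = torch.nn.Sequential(
    torch.nn.Linear(obs_dim + act_dim, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))

def actor_loss(obs):
    mu, log_std = policy(obs).chunk(2, dim=-1)
    std = log_std.clamp(-5, 2).exp()
    dist = torch.distributions.Normal(mu, std)
    u = dist.rsample()                        # reparameterised sample
    a = torch.tanh(u)                         # squash to the action range
    # log-probability with tanh change-of-variables correction
    logp = dist.log_prob(u).sum(-1) - torch.log(1 - a.pow(2) + 1e-6).sum(-1)
    q = q_net(torch.cat([obs, a], dim=-1)).squeeze(-1)
    return (alpha * logp - q).mean()

print(actor_loss(torch.randn(4, obs_dim)))
```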
Funding: Supported by the National Natural Science Foundation of China (61472441).
Abstract: In this paper, a static weapon target assignment (WTA) problem is studied. As a critical problem in cooperative air combat, the outcome of WTA directly influences the battle. With the cost of weapons rising rapidly, it is essential to design a target assignment model that minimizes target survivability and weapon consumption simultaneously. An algorithm named the improved artificial fish swarm algorithm-improved harmony search algorithm (IAFSA-IHS) is then proposed to solve the problem. The effect of the proposed algorithm is demonstrated in numerical simulations, and the results show that it performs well in searching for the optimal solution to the WTA problem.
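A static WTA objective of the kind described, expected target survivability plus weapon consumption, can be sketched as below. The values, kill probabilities, weights, and the greedy baseline search are illustrative assumptions, not the paper's model or its IAFSA-IHS solver.

```python
import numpy as np

# Toy static WTA objective: minimize a weighted sum of expected target
# survivability (value-weighted) and weapon consumption.
rng = np.random.default_rng(1)
W, T = 5, 3                                   # weapons, targets
value = rng.uniform(1.0, 3.0, size=T)         # target values
p_kill = rng.uniform(0.4, 0.9, size=(W, T))   # kill probability of weapon w on target t
cost = rng.uniform(0.5, 1.0, size=W)          # cost of firing each weapon
alpha, beta = 1.0, 0.3

def objective(assign):
    # assign[w] = target index, or -1 if the weapon is withheld
    survive = np.ones(T)
    used = 0.0
    for w, t in enumerate(assign):
        if t >= 0:
            survive[t] *= (1.0 - p_kill[w, t])
            used += cost[w]
    return alpha * float(value @ survive) + beta * used

# Greedy baseline: give each weapon whichever choice most reduces the objective.
assign = [-1] * W
for w in range(W):
    assign[w] = min(range(-1, T),
                    key=lambda t: objective(assign[:w] + [t] + assign[w + 1:]))
print("assignment:", assign, "objective:", round(objective(assign), 3))
```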
Funding: National Natural Science Foundation of China (Grant Nos. 61573285 and 62003267); Open Fund of the Key Laboratory of Data Link Technology of China Electronics Technology Group Corporation (Grant No. CLDL-20182101); Natural Science Foundation of Shaanxi Province (Grant No. 2020JQ220).
Abstract: Aiming at intelligent decision-making for unmanned aerial vehicles (UAVs) based on situation information in air combat, a novel maneuvering decision method based on deep reinforcement learning is proposed in this paper. The autonomous maneuvering model of the UAV is established as a Markov Decision Process. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and the Deep Deterministic Policy Gradient (DDPG) algorithm from deep reinforcement learning are used to train the model, and the experimental results of the two algorithms are analyzed and compared. The simulation results show that, compared with the DDPG algorithm, the TD3 algorithm has stronger decision-making performance and faster convergence and is more suitable for solving combat problems. The proposed algorithm enables UAVs to autonomously make maneuvering decisions based on situation information such as position, speed, and relative azimuth, adjust their actions to approach, and successfully strike the enemy, providing a new method for intelligent UAV maneuvering decisions during air combat.
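What distinguishes TD3 from DDPG in the comparison above is its critic target: twin critics, target-policy smoothing noise, and the clipped double-Q minimum. The sketch below shows that standard target computation; the network shapes and hyper-parameters are placeholders, not the paper's implementation.

```python
import torch

# Minimal TD3 critic-target computation on a fake batch.
obs_dim, act_dim = 8, 3
def mlp(i, o):
    return torch.nn.Sequential(torch.nn.Linear(i, 64), torch.nn.ReLU(), torch.nn.Linear(64, o))

actor_targ = mlp(obs_dim, act_dim)
q1_targ = mlp(obs_dim + act_dim, 1)
q2_targ = mlp(obs_dim + act_dim, 1)

def td3_target(r, s2, done, gamma=0.99, noise_std=0.2, noise_clip=0.5, act_limit=1.0):
    with torch.no_grad():
        a2 = torch.tanh(actor_targ(s2)) * act_limit
        noise = (noise_std * torch.randn_like(a2)).clamp(-noise_clip, noise_clip)
        a2 = (a2 + noise).clamp(-act_limit, act_limit)     # target policy smoothing
        q_in = torch.cat([s2, a2], dim=-1)
        q_min = torch.min(q1_targ(q_in), q2_targ(q_in))    # clipped double-Q
        return r + gamma * (1.0 - done) * q_min.squeeze(-1)

s2, r, done = torch.randn(4, obs_dim), torch.randn(4), torch.zeros(4)
print(td3_target(r, s2, done))
```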
Abstract: In evaluating the combat effectiveness of a defense system, the probability that a target penetrates the defended area is a primary index of concern. In this paper, a stochastic model for computing the probability that a target penetrates the defended area along any flight path is established through state analysis and statistical equilibrium analysis from stochastic service system theory. The simulated annealing algorithm is a heuristic random search method based on Monte Carlo recursion that can find the global optimal solution by simulating the annealing process. Combining the stochastic penetration-probability model with simulated annealing, this paper establishes a quantitative method for optimizing the combat configuration of weapon systems. The calculated results show that a near-optimal configuration of the weapon's fire cells is found quickly with this method, and that this quantitative approach to combat configuration is faster and more rigorous than the previous one based on mapping the fire field.
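The simulated annealing loop itself is standard and can be sketched as follows. The objective here is a stand-in that scores a candidate placement of fire cells on a one-dimensional grid; it is not the paper's stochastic service-system penetration model.

```python
import math, random

# Generic simulated annealing over fire-cell positions.
random.seed(0)
N_CELLS, GRID = 4, 10

def penetration_prob(placement):
    # toy objective: targets slip through the largest gap, so spread cells out
    xs = sorted(placement)
    gaps = [xs[0]] + [b - a for a, b in zip(xs, xs[1:])] + [GRID - xs[-1]]
    return max(gaps) / GRID

def neighbour(placement):
    p = list(placement)
    i = random.randrange(N_CELLS)
    p[i] = min(GRID, max(0, p[i] + random.choice([-1, 1])))
    return p

state = [random.randint(0, GRID) for _ in range(N_CELLS)]
cost = penetration_prob(state)
T, T_min, alpha = 1.0, 1e-3, 0.95
while T > T_min:
    for _ in range(50):
        cand = neighbour(state)
        c = penetration_prob(cand)
        # accept improvements always, worse moves with Boltzmann probability
        if c < cost or random.random() < math.exp((cost - c) / T):
            state, cost = cand, c
    T *= alpha                                # cooling schedule
print("fire-cell positions:", sorted(state), "penetration prob:", round(cost, 3))
```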
Funding: Co-supported by the National Natural Science Foundation of China (No. 52272382), the Aeronautical Science Foundation of China (No. 20200017051001), and the Fundamental Research Funds for the Central Universities, China.
Abstract: A highly intelligent Unmanned Combat Aerial Vehicle (UCAV) formation is expected to bring out its strengths in Beyond-Visual-Range (BVR) air combat. Although Multi-Agent Reinforcement Learning (MARL) shows outstanding performance in cooperative decision-making, it is challenging for existing MARL algorithms to quickly converge to an optimal strategy for a UCAV formation in BVR air combat, where confrontation is complicated and rewards are extremely sparse and delayed. To solve this problem, this paper proposes an Advantage Highlight Multi-Agent Proximal Policy Optimization (AHMAPPO) algorithm. First, at every step, AHMAPPO records the degree to which the best formation exceeds the average of the formations in parallel environments and carries out additional advantage sampling accordingly. Then, the sampling result is introduced into the update of the actor network to improve its optimization efficiency. Finally, the simulation results reveal that, compared with several state-of-the-art MARL algorithms, AHMAPPO obtains a better strategy using fewer sample episodes in the UCAV formation BVR air combat simulation environment built in this paper, which reflects the critical features of BVR air combat. AHMAPPO significantly increases the convergence efficiency of the formation strategy in BVR air combat, with a maximum improvement of 81.5% relative to the other algorithms.
Funding: Supported by the Aeronautical Science Foundation of China (2017ZC53033) and the Seed Foundation of Innovation and Creation for Graduate Students in Northwestern Polytechnical University (CX2020156).
Abstract: In order to improve the autonomous ability of unmanned aerial vehicles (UAVs) to carry out air combat missions, many artificial-intelligence-based studies of autonomous air combat maneuver decision-making have been carried out, but these studies are often aimed at individual decision-making in 1v1 scenarios, which rarely occur in actual air combat. Building on research into 1v1 autonomous air combat maneuver decisions, this paper builds a multi-UAV cooperative air combat maneuver decision model based on multi-agent reinforcement learning. First, a bidirectional recurrent neural network (BRNN) is used to achieve communication between individual UAVs, and the multi-UAV cooperative maneuver decision model is established under the actor-critic architecture. Second, by combining target allocation with air combat situation assessment, the tactical goal of the formation is merged with the reinforcement learning goal of every UAV, and a cooperative tactical maneuver policy is generated. The simulation results show that the model established in this paper can obtain the cooperative maneuver policy through reinforcement learning, and that this policy can guide the UAVs to obtain the overall situational advantage and defeat the opponents through tactical cooperation.
Funding: This project was supported by the Fund of College Doctor Degree (20020699009).
Abstract: Target distribution in cooperative combat is a difficult and important problem. We build an optimization model according to the rules of fire distribution and study it with the Bayesian optimization algorithm (BOA). The BOA estimates the joint probability distribution of the variables with a Bayesian network, and new candidate solutions are generated from this joint distribution. A simulation example verified that the method can solve this complex problem quickly and find the best solution.
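The sketch below illustrates the estimate-then-sample loop that estimation-of-distribution algorithms such as the BOA follow. For brevity it models each variable independently (UMDA-style); the actual BOA learns a Bayesian network over the variables instead, and the advantage matrix and fitness are invented.

```python
import numpy as np

# Simplified estimation-of-distribution loop for a fighter-to-target assignment.
rng = np.random.default_rng(0)
F, T = 6, 4                                   # fighters, targets
advantage = rng.uniform(0, 1, size=(F, T))    # hypothetical advantage matrix

def fitness(assign):                          # total advantage of an assignment
    return advantage[np.arange(F), assign].sum()

pop, elite_frac, gens = 60, 0.3, 40
prob = np.full((F, T), 1.0 / T)               # marginal P(fighter f attacks target t)
for _ in range(gens):
    samples = np.array([[rng.choice(T, p=prob[f]) for f in range(F)] for _ in range(pop)])
    scores = np.array([fitness(s) for s in samples])
    elite = samples[np.argsort(scores)[-int(pop * elite_frac):]]
    for f in range(F):                        # re-estimate marginals from the elite set
        counts = np.bincount(elite[:, f], minlength=T) + 1e-6
        prob[f] = counts / counts.sum()

best = prob.argmax(axis=1)
print("most probable assignment:", best, "fitness:", round(fitness(best), 3))
```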
Funding: Supported by the National Natural Science Foundation of China (61673209, 71971115).
Abstract: The dynamic weapon target assignment (DWTA) problem is of great significance in modern air combat. However, DWTA is a highly complex constrained multi-objective combinatorial optimization problem. An improved elitist non-dominated sorting genetic algorithm-II (NSGA-II), called the non-dominated shuffled frog leaping algorithm (NSFLA), is proposed to maximize damage to enemy targets and minimize self-threat under air combat constraints. In NSFLA, the shuffled frog leaping algorithm (SFLA) is introduced into NSGA-II to replace the internal evolutionary scheme of the genetic algorithm (GA), which suffers from low optimization speed and uneven search of the space. Two further improvements are made to promote the internal optimization performance of SFLA. First, the local evolution scheme, a novel crossover mechanism, ensures that every individual participates in updating instead of only the worst ones, which expands the diversity of the population. Second, a discrete adaptive mutation algorithm based on the function change rate is applied to balance global and local search. Finally, the scheme is verified in various air combat scenarios. The results show that the proposed NSFLA has clear advantages in solution quality and efficiency, especially with many aircraft and in dynamic air combat environments.
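Fast non-dominated sorting is the NSGA-II building block that NSFLA keeps while swapping the GA's evolutionary scheme for SFLA. The sketch below shows that sorting on arbitrary two-objective points (e.g. self-threat versus negative damage, both to be minimized); it is generic, not the paper's code.

```python
# Fast non-dominated sorting into Pareto fronts (minimization objectives).
def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(points):
    fronts, S, n = [[]], [set() for _ in points], [0] * len(points)
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            if dominates(p, q):
                S[i].add(j)                   # i dominates j
            elif dominates(q, p):
                n[i] += 1                     # i is dominated by one more point
        if n[i] == 0:
            fronts[0].append(i)
    k = 0
    while fronts[k]:
        nxt = []
        for i in fronts[k]:
            for j in S[i]:
                n[j] -= 1
                if n[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
        k += 1
    return fronts[:-1]

# two minimization objectives: (self-threat, negative damage)
pts = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2), (2, 6)]
print(non_dominated_sort(pts))
```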
Funding: The National Natural Science Foundation of China (No. 61074090).
Abstract: In order to adapt to the changing battlefield situation and improve the combat effectiveness of air combat, the air battle allocation problem is studied based on the Bayesian optimization algorithm (BOA). First, we consider the number of fighters on both sides and apply cluster analysis to divide our fighters into the same number of groups as the enemy. On this basis, we rank each of our fighters' advantages over the enemy fighters and obtain a series of target allocation schemes for enemy attacks by a first-in-first-served criterion. Finally, with the maximum advantage function as the objective, the BOA is used to optimize the model. The simulation results show that the established model has a certain decision-making ability and that the BOA converges to the global optimal solution at a faster speed, effectively solving the air combat task assignment problem.
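The first step of that pipeline, clustering friendly fighters into as many groups as there are enemy fighters, might look like the following. The positions are random placeholders and scikit-learn's KMeans is just one standard choice of clustering method, not necessarily the one used in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
our_positions = rng.uniform(0, 100, size=(8, 2))   # 8 friendly fighters, (x, y) in km
n_enemy = 3

# Partition our fighters into n_enemy groups by spatial proximity.
groups = KMeans(n_clusters=n_enemy, n_init=10, random_state=0).fit_predict(our_positions)
for g in range(n_enemy):
    print(f"group {g}: fighters {np.where(groups == g)[0].tolist()}")
```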
Abstract: This paper presents a rule-based framework for addressing decision-making problems within the context of the "UI-STRIVE" Competition. First, two distinct autonomous confrontation scenarios are described: autonomous air combat and cooperative interception. Second, a State-Event-Condition-Action (SECA) decision-making framework is developed, which integrates the finite state machine and event-condition-action frameworks. This framework provides three products to describe rules, i.e., the SECA model, the SECA state chart, and the SECA rule description. Third, situation assessment and target assignment during autonomous air combat are investigated, and the corresponding mathematical models are established. Finally, the rationality and feasibility of the decision-making model are verified through data simulation and analysis.
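A minimal sketch of a state-event-condition-action rule table of the kind such a framework formalizes is shown below. The states, events, conditions, and actions are invented placeholders, not the competition's actual rule set.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Rule:
    state: str                                # state in which the rule applies
    event: str                                # triggering event
    condition: Callable[[dict], bool]         # guard over the context
    action: Callable[[dict], None]            # effect when the rule fires
    next_state: str                           # state transition

@dataclass
class SecaMachine:
    state: str
    rules: list = field(default_factory=list)

    def dispatch(self, event: str, ctx: dict):
        for r in self.rules:
            if r.state == self.state and r.event == event and r.condition(ctx):
                r.action(ctx)
                self.state = r.next_state
                return

fsm = SecaMachine(state="patrol", rules=[
    Rule("patrol", "target_detected", lambda c: c["range_km"] < 60,
         lambda c: print("lock target", c["target_id"]), "engage"),
    Rule("engage", "missile_warning", lambda c: True,
         lambda c: print("evasive manoeuvre"), "defend"),
])

fsm.dispatch("target_detected", {"range_km": 42, "target_id": "T1"})
fsm.dispatch("missile_warning", {})
print("final state:", fsm.state)
```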
Funding: Supported by the National Natural Science Foundation of China under the project "Study on parallel intelligent optimization simulation with combination of qualitative and quantitative method" (61004089), and by the Graduate Student Innovation Practice Foundation of Beihang University, China, under the project "Research of an efficient and intelligent optimization method and application in aircraft shape design" (YCSJ-01-201205).
Abstract: In order to achieve the optimal attack outcome in air combat under the beyond-visual-range (BVR) condition, the decision-making (DM) problem, which is to set a proper assignment of the friendly fighters to the hostile fighters, is the most crucial task for cooperative multiple target attack (CMTA). In this paper, a heuristic quantum genetic algorithm (HQGA) is proposed to solve the DM problem. The originality of our work lies in the following aspects: (1) the HQGA assigns hostile fighters to individual missiles rather than to fighters, so that chromosomes can be encoded with quantum bits (Q-bits); (2) the relative successful sequence probability (RSSP) is defined, based on which the priority attack vector is constructed; (3) the HQGA heuristically modifies quantum chromosomes according to the modification technique proposed in this paper; (4) last but not least, under some special conditions the HQGA relaxes a constraint imposed by other algorithms and thereby obtains a better result. At the end of this paper, two examples are presented to show that the HQGA has advantages over other algorithms when dealing with the DM problem in the context of CMTA.
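The Q-bit chromosome machinery that quantum-inspired genetic algorithms such as the HQGA build on can be sketched as follows: each gene is an amplitude pair, "observed" into a classical bit, and steered toward the best solution by a rotation gate. The rotation angle and the toy fitness are illustrative, not the paper's heuristic modification technique.

```python
import numpy as np

# Quantum-inspired GA skeleton with Q-bit chromosomes.
rng = np.random.default_rng(0)
n_bits, pop, gens, dtheta = 10, 20, 50, 0.05 * np.pi

theta = np.full((pop, n_bits), np.pi / 4)     # alpha = cos(theta), beta = sin(theta)

def observe(theta):
    # collapse each Q-bit: P(bit = 1) = |beta|^2 = sin(theta)^2
    return (rng.random(theta.shape) < np.sin(theta) ** 2).astype(int)

def fitness(bits):                            # toy objective: maximize the number of ones
    return bits.sum(axis=1)

best_bits, best_fit = None, -1
for _ in range(gens):
    bits = observe(theta)
    fit = fitness(bits)
    if fit.max() > best_fit:
        best_fit, best_bits = fit.max(), bits[fit.argmax()].copy()
    # rotation gate: nudge each Q-bit toward the value it takes in the best solution
    direction = np.where(bits < best_bits, 1.0, np.where(bits > best_bits, -1.0, 0.0))
    theta = np.clip(theta + dtheta * direction, 0.01, np.pi / 2 - 0.01)

print("best bit string:", best_bits, "fitness:", best_fit)
```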