Funding: Co-supported by the National Natural Science Foundation of China (No. 52272382), the Aeronautical Science Foundation of China (No. 20200017051001), and the Fundamental Research Funds for the Central Universities, China.
Abstract: Highly intelligent Unmanned Combat Aerial Vehicle (UCAV) formations are expected to show their strengths in Beyond-Visual-Range (BVR) air combat. Although Multi-Agent Reinforcement Learning (MARL) performs well in cooperative decision-making, existing MARL algorithms struggle to converge quickly to an optimal strategy for UCAV formations in BVR air combat, where the confrontation is complicated and the reward is extremely sparse and delayed. To address this problem, this paper proposes an Advantage Highlight Multi-Agent Proximal Policy Optimization (AHMAPPO) algorithm. First, at every step, AHMAPPO records the degree to which the best formation exceeds the average of the formations in parallel environments and carries out additional advantage sampling accordingly. Then, the sampling result is introduced into the update of the actor network to improve its optimization efficiency. Finally, simulation results show that, compared with some state-of-the-art MARL algorithms, AHMAPPO obtains a better strategy from fewer sample episodes in the UCAV formation BVR air combat simulation environment built in this paper, which reflects the critical features of BVR air combat. AHMAPPO significantly increases the convergence efficiency of the strategy for UCAV formations in BVR air combat, with a maximum increase of 81.5% relative to the other algorithms.
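For orientation, the actor update that PPO-family methods build on can be sketched as follows. This is a minimal sketch of the standard PPO clipped surrogate objective only. It does not reproduce the additional advantage sampling of AHMAPPO, which the abstract does not specify in detail. The function name and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates for the same samples
    clip_eps:   clipping range (illustrative default, not from the abstract)
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal in length")
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Clip the ratio to [1 - eps, 1 + eps] to limit the size of the policy step.
        clipped_r = max(1.0 - clip_eps, min(1.0 + clip_eps, r))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)
```

Any scheme that re-weights or re-samples advantages, as the abstract describes, would change the `advantages` passed in rather than the clipping itself.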
Abstract: A heuristic particle swarm optimization (HPSO) algorithm is proposed to solve the decision-making (DM) problem by combining particle swarm optimization (PSO) with a heuristic algorithm (HA) developed from specific knowledge of cooperative multiple target attack (CMTA) tactics. HA facilitates the search for a local optimum in the neighborhood of a solution, while PSO tends to explore the search space for possible solutions. By combining the advantages of HA and PSO, HPSO can find the global optimum quickly and efficiently. It obtains the DM solution by seeking the optimal assignment of the missiles of friendly fighter aircraft (FAs) to hostile FAs. Simulation results show that the proposed algorithm is superior to the general PSO algorithm and two GA-based algorithms in searching for the best solution to the DM problem.
Funding: Supported by Major Projects for Science and Technology Innovation 2030 (Grant No. 2018AA0100800), the Equipment Pre-research Foundation of Laboratory (Grant No. 61425040104), and in part by the Jiangsu Province "333" project under Grant BRA2019051.
Abstract: Game theory can be applied to the air combat decision-making problem of multiple unmanned combat air vehicles (UCAVs). However, it is difficult to obtain satisfactory decisions by relying solely on air combat situation information, because a complex air combat environment contains a large amount of time-sensitive information. In this paper, a constraint strategy game approach is developed to generate intelligent decisions for multiple UCAVs in a complex air combat environment using both air combat situation information and time-sensitive information. First, a constraint strategy game is employed to model the attack-defense decision-making problem in a complex air combat environment. Then, an algorithm based on linear programming and linear inequalities (CSG-LL) is proposed for solving the constraint strategy game. Finally, an example is given to illustrate the effectiveness of the proposed approach.
Funding: Jointly granted by the Science and Technology on Avionics Integration Laboratory and the Aeronautical Science Foundation of China (No. 2016ZC15008).
Abstract: A novel particle swarm optimization algorithm is proposed for the missile-target assignment (MTA) decision-making problem in a multiple-target collaborative combat situation. A threat function is established to describe the air combat situation, and an optimization function is used to find an optimal missile-target assignment. An improved particle swarm optimization algorithm, based on an adaptive random learning approach, is used to solve the optimization function with fewer parameters. The assignment is then adjusted according to coordinated attack tactics. Simulation results show that the algorithm is effective in handling the MTA decision-making problem in air combat.
Funding: Supported by the Natural Science Foundation of China (Grant No. 60604009), the Aeronautical Science Foundation of China (Grant No. 2006ZC51039), the Beijing NOVA Program Foundation of China (Grant No. 2007A017), and the Open Fund of the Provincial Key Laboratory for Information Processing Technology, Suzhou University (Grant No. KJS0821).
Abstract: Optimal formation reconfiguration control of multiple Uninhabited Combat Air Vehicles (UCAVs) is a complicated global optimization problem. Particle Swarm Optimization (PSO) is a population-based stochastic optimization technique inspired by the social behaviour of bird flocking or fish schooling. PSO can achieve better results in a faster, cheaper way than other bio-inspired computational methods, and it has few parameters to adjust. In this paper, we propose an improved PSO model for solving the optimal formation reconfiguration control problem for multiple UCAVs. First, the Control Parameterization and Time Discretization (CPTD) method is designed in detail. Then, a mutation strategy and a special mutation-escape operator are adopted in the improved PSO model to make particles explore the search space more efficiently. The proposed strategy can dynamically produce a large speed value according to the variation of the speed, which lets the algorithm explore the local and global minima thoroughly at the same time. A series of experimental results demonstrates the feasibility and effectiveness of the proposed method in solving the optimal formation reconfiguration control problem for multiple UCAVs.
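The basic PSO update that several of these abstracts build on can be sketched as below. This is a plain global-best PSO under stated assumptions. It does not include the mutation strategy, the mutation-escape operator, or the CPTD formulation of the paper. The function name, the coefficient values (`inertia=0.7`, `c1=c2=1.5`), and the sphere objective used in the usage note are illustrative choices, not taken from the source.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # personal best positions
    pbest_val = [objective(p) for p in pos]        # personal best values
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # global best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

A call such as `pso_minimize(lambda x: sum(v * v for v in x), dim=2, bounds=(-5.0, 5.0))` minimizes a simple sphere function. A formation reconfiguration problem would replace that objective with a cost over the discretized control parameters.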
Abstract: Based on effectiveness analysis, a novel method is presented for combat aircraft top-hierarchy concept evaluation and decision-making. By applying multi-criterion decision-making (MCDM) and the analytic hierarchy process, the new method helps to overcome the limitations of existing evaluation systems and decision-making methods. The proposed method includes the following steps: (1) establish a multi-criterion, multi-hierarchy evaluation attribute system by introducing combat effectiveness; (2) assign weights to the attributes and normalize them; (3) evaluate and decide on the top-hierarchy aircraft concept based on effectiveness to reach a satisfactory design by comprehensively applying four multi-criterion decision-making methodologies, i.e., the grey correlation projection method, the weighted summation method, the weighted quadrature method, and the ideal solution decision-making method, while considering the attribute hierarchy system and the logical relations among the attributes. Finally, an example is given to indicate the validity and feasibility of the proposed method.
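Of the four methodologies named above, the weighted summation method is the simplest to illustrate. The sketch below assumes that every attribute is a benefit-type attribute (larger is better) with positive values, and that normalization divides each attribute by its column maximum. The abstract does not state the normalization used, so that choice, the function name, and the sample concepts are all hypothetical.

```python
def weighted_sum_ranking(scores, weights):
    """Rank alternatives by a weighted sum of max-normalized attributes.

    scores:  dict mapping alternative name -> list of positive attribute values
    weights: list of non-negative attribute weights (normalized here to sum to 1)
    """
    total_w = sum(weights)
    norm_w = [w / total_w for w in weights]
    n_attr = len(weights)
    # Column maxima, used to scale every attribute into (0, 1].
    col_max = [max(values[j] for values in scores.values()) for j in range(n_attr)]
    totals = {
        name: sum(norm_w[j] * values[j] / col_max[j] for j in range(n_attr))
        for name, values in scores.items()
    }
    # Highest aggregate score first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical example: two aircraft concepts scored on two attributes.
ranking = weighted_sum_ranking(
    {"concept_A": [10.0, 2.0], "concept_B": [5.0, 4.0]},
    weights=[0.75, 0.25],
)
```

With these made-up numbers, `concept_A` scores 0.875 and `concept_B` scores 0.625, so `concept_A` ranks first. Cost-type attributes would need a different normalization.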
Funding: Supported by the Major Projects for Science and Technology Innovation 2030 (2018AAA0100805).
Abstract: The threat sequencing of multiple unmanned combat air vehicles (UCAVs) is a multi-attribute decision-making (MADM) problem. In the threat sequencing process, the weight coefficients of the threat indicators are usually time-varying because of the strong confrontation and high dynamics of the air combat environment. Moreover, air combat data are difficult to obtain accurately. In this study, a threat sequencing method for multiple UCAVs is proposed based on game theory, taking incomplete information into account. First, a zero-sum game model between the decision maker (D) and nature (N) with fuzzy payoffs is established to obtain the uncertain parameters, namely the weight coefficients of the threat indicators and the interval parameters of the threat matrix. Then, the zero-sum game with fuzzy payoffs is transformed into a zero-sum game with crisp payoffs (a matrix game) so that it can be solved. Moreover, a decision rule for the threat sequencing problem of multiple UCAVs is given based on the obtained uncertain parameters. Finally, numerical simulation results are presented to show the effectiveness of the proposed approach.
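Once a game has crisp payoffs, the smallest non-trivial case can be solved in closed form. The sketch below handles only a 2x2 zero-sum matrix game from the row player's side. It is an illustration of what "solving a matrix game" means, not the paper's method, whose fuzzy-to-crisp transformation and game size are not given in the abstract. The function name and the payoff values are assumptions.

```python
def solve_2x2_zero_sum(payoff):
    """Solve a 2x2 zero-sum matrix game for the row player (the maximizer).

    payoff: [[a, b], [c, d]], payoffs to the row player.
    Returns (p, value): p is the probability of playing row 0 in an optimal
    strategy, value is the value of the game.
    """
    (a, b), (c, d) = payoff
    maximin = max(min(a, b), min(c, d))   # best guaranteed payoff with a pure row
    minimax = min(max(a, c), max(b, d))   # best cap the column player can enforce
    if maximin == minimax:
        # Pure-strategy saddle point: play the row that attains the maximin.
        p = 1.0 if min(a, b) >= min(c, d) else 0.0
        return p, float(maximin)
    # No saddle point: mix so that both columns give the same expected payoff.
    denom = a - b - c + d
    p = (d - c) / denom
    value = (a * d - b * c) / denom
    return p, value
```

For the hypothetical payoff matrix `[[2, -1], [-1, 1]]`, the row player mixes with `p = 0.4` and the value of the game is `0.2`. Larger games are usually solved by linear programming instead.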
Funding: The authors acknowledge the National Natural Science Foundation of China (Grant No. 61573285, No. 62003267), the Open Fund of Key Laboratory of Data Link Technology of China Electronics Technology Group Corporation (Grant No. CLDL-20182101), and the Natural Science Foundation of Shaanxi Province (Grant No. 2020JQ220) for providing funds to conduct the experiments.
Abstract: To enable intelligent decision-making by unmanned aerial vehicles (UAVs) based on situation information in air combat, a novel maneuvering decision method based on deep reinforcement learning is proposed in this paper. The autonomous maneuvering model of the UAV is established as a Markov Decision Process. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and the Deep Deterministic Policy Gradient (DDPG) algorithm are used to train the model, and the experimental results of the two algorithms are analyzed and compared. The simulation results show that, compared with DDPG, TD3 has stronger decision-making performance and faster convergence and is more suitable for solving combat problems. The proposed algorithm enables UAVs to make maneuvering decisions autonomously based on situation information such as position, speed, and relative azimuth, to adjust their actions to approach the enemy, and to strike it successfully, providing a new method for intelligent UAV maneuvering decisions during air combat.
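The main difference between TD3 and DDPG lies in how the critic's learning target is formed. The sketch below shows two standard TD3 ingredients for a scalar action: the clipped double-Q target and target policy smoothing. It is a minimal sketch of the generic algorithm, not of the paper's UAV model. The function names and the default hyperparameters are illustrative assumptions.

```python
import random


def td3_target(reward, done, next_q1, next_q2, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of two target critics."""
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * min(next_q1, next_q2)


def smoothed_target_action(action, rng, noise_std=0.2, noise_clip=0.5, act_limit=1.0):
    """Target policy smoothing: add clipped Gaussian noise to the target action."""
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    # Keep the perturbed action inside the valid action range.
    return max(-act_limit, min(act_limit, action + noise))
```

DDPG would bootstrap from a single target critic without the `min`. Taking the minimum is what reduces value overestimation in TD3. The third TD3 ingredient, delayed actor updates, is a training-loop detail and is not shown here.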
Funding: Supported by the Aeronautical Science Foundation of China (2017ZC53033) and the Seed Foundation of Innovation and Creation for Graduate Students in Northwestern Polytechnical University (CX2020156).
Abstract: To improve the ability of unmanned aerial vehicles (UAVs) to carry out air combat missions autonomously, many artificial intelligence-based studies of autonomous air combat maneuver decision-making have been conducted, but these studies often target individual decision-making in 1v1 scenarios, which rarely occur in actual air combat. Building on research into 1v1 autonomous air combat maneuver decisions, this paper constructs a multi-UAV cooperative air combat maneuver decision model based on multi-agent reinforcement learning. First, a bidirectional recurrent neural network (BRNN) is used to achieve communication between individual UAVs, and the multi-UAV cooperative air combat maneuver decision model is established under the actor-critic architecture. Second, by combining target allocation with air combat situation assessment, the tactical goal of the formation is merged with the reinforcement learning goal of every UAV, and a cooperative tactical maneuver policy is generated. The simulation results show that the model can obtain a cooperative maneuver policy through reinforcement learning, and that this policy can guide the UAVs to gain the overall situational advantage and defeat their opponents through tactical cooperation.
Funding: Supported by the National Natural Science Foundation of China (No. 61573286), the Aeronautical Science Foundation of China (No. 20180753006), the Fundamental Research Funds for the Central Universities (3102019ZDHKY07), the Natural Science Foundation of Shaanxi Province (2020JQ-218), and the Shaanxi Province Key Laboratory of Flight Control and Simulation Technology.
Abstract: Recent advances in on-board radar and missile capabilities, combined with individual payload limitations, have led to increased interest in the use of unmanned combat aerial vehicles (UCAVs) for cooperative occupation during beyond-visual-range (BVR) air combat. However, prior research on occupational decision-making in BVR air combat has mostly been limited to one-on-one scenarios. As such, this study presents a practical cooperative occupation decision-making methodology for use with multiple UCAVs. The weapon engagement zone (WEZ) and combat geometry were first used to develop an advantage function for situational assessment of one-on-one engagement. An encircling advantage function was then designed to represent the cooperation of UCAVs, thereby establishing a cooperative occupation model. The corresponding objective function was derived from the one-on-one engagement advantage function and the encircling advantage function. The resulting model exhibited similarities to a mixed-integer nonlinear programming (MINLP) problem. As such, an improved discrete particle swarm optimization (DPSO) algorithm was used to identify a solution. The occupation process was then converted into a formation switching task as part of the cooperative occupation model. A series of simulations were conducted to verify occupational solutions in varying situations, including two-on-two engagement. Simulated results showed these solutions varied with initial conditions and weighting coefficients. This occupation process, based on formation switching, effectively demonstrates the viability of the proposed technique. These cooperative occupation results could provide a theoretical framework for subsequent research in cooperative BVR air combat.
Abstract: This paper presents a rule-based framework for addressing decision-making problems within the context of the "UI-STRIVE" Competition. First, two distinct autonomous confrontation scenarios are described: autonomous air combat and cooperative interception. Second, a State-Event-Condition-Action (SECA) decision-making framework is developed, which integrates the finite state machine and event-condition-action frameworks. This framework provides three products to describe rules, i.e., the SECA model, the SECA state chart, and the SECA rule description. Third, the situation assessment and target assignment during autonomous air combat are investigated, and the mathematical models are established. Finally, the rationality and feasibility of the decision-making model are verified through data simulation and analysis.
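One way to picture a combination of a finite state machine with event-condition-action rules is a rule table in which each rule is tied to a state and names the next state. The sketch below is only an interpretation of that idea. The abstract does not define the SECA model, so the `Rule` fields, the `step` function, and every state, event, condition, and action name are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Rule:
    state: str                          # state in which the rule is active
    event: str                          # event that triggers evaluation
    condition: Callable[[Dict], bool]   # guard evaluated on the current context
    action: str                         # action to issue when the rule fires
    next_state: str                     # state to move to afterwards


def step(rules: List[Rule], state: str, event: str, ctx: Dict) -> Tuple[str, str]:
    """Fire the first rule matching (state, event) whose condition holds."""
    for rule in rules:
        if rule.state == state and rule.event == event and rule.condition(ctx):
            return rule.action, rule.next_state
    # No rule fired: keep the current state and do nothing new.
    return "hold", state


# Hypothetical rule set for illustration only.
EXAMPLE_RULES = [
    Rule("SEARCH", "target_detected", lambda c: c["range_km"] < 80.0,
         "assign_target", "ENGAGE"),
    Rule("ENGAGE", "missile_warning", lambda c: True,
         "break_away", "EVADE"),
]
```

Calling `step(EXAMPLE_RULES, "SEARCH", "target_detected", {"range_km": 50.0})` returns `("assign_target", "ENGAGE")`, while a detection at 120 km leaves the machine in `SEARCH` because the guard fails.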