Funding: National Natural Science Foundation of China, Grant/Award Number: 62003267; Fundamental Research Funds for the Central Universities, Grant/Award Number: G2022KY0602; Technology on Electromagnetic Space Operations and Applications Laboratory, Grant/Award Number: 2022ZX0090; Key Core Technology Research Plan of Xi'an, Grant/Award Number: 21RGZN0016.
Abstract: To address the problem of manoeuvring decision-making in UAV air combat, this study establishes a one-to-one air combat model, defines missile attack areas, and uses the non-deterministic policy Soft Actor-Critic (SAC) algorithm in deep reinforcement learning to construct a decision model that realizes the manoeuvring process. In addition, the complexity of the proposed algorithm is calculated, and the stability of the closed-loop air combat decision-making system controlled by a neural network is analysed using a Lyapunov function. The study treats the UAV air combat process as a game and proposes a Parallel Self-Play training SAC algorithm (PSP-SAC) to improve the generalisation performance of UAV control decisions. Simulation results show that the proposed algorithm realizes sample sharing and policy sharing across multiple combat environments and significantly improves the generalisation ability of the model compared with independent training.
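For orientation, the "non-deterministic policy" in SAC refers to a stochastic policy trained under a maximum-entropy objective. The block below states that objective in its commonly cited general form; the symbols (reward $r$, temperature $\alpha$, entropy $\mathcal{H}$, state-action distribution $\rho_\pi$) are standard notation assumed here, not taken from the abstract, and the paper's own reward design is not shown.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\!\left( \pi(\cdot \mid s_t) \right) \right],
\qquad
\mathcal{H}\!\left( \pi(\cdot \mid s_t) \right)
  = - \mathbb{E}_{a \sim \pi(\cdot \mid s_t)} \left[ \log \pi(a \mid s_t) \right]
```

A larger $\alpha$ weights exploration (policy entropy) more heavily relative to reward.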
Funding: Supported by the National Natural Science Foundation of China (No. 61573286), the Aeronautical Science Foundation of China (No. 20180753006), the Fundamental Research Funds for the Central Universities (3102019ZDHKY07), the Natural Science Foundation of Shaanxi Province (2019JM-163, 2020JQ-218), and the Shaanxi Province Key Laboratory of Flight Control and Simulation Technology.
Abstract: To realize autonomous aerial combat decision-making for unmanned combat aerial vehicles (UCAVs) rapidly and accurately in an uncertain environment, this paper proposes a decision-making method based on an improved deep reinforcement learning (DRL) algorithm: the multistep double deep Q-network (MS-DDQN) algorithm. First, a six-degree-of-freedom UCAV model based on an aircraft control system is established on a simulation platform, and the situation assessment functions of the UCAV and its target are established by considering their angles, altitudes, environments, missile attack performances, and UCAV performance. By controlling the flight path angle, roll angle, and flight velocity, 27 common basic actions are designed. On this basis, to overcome the shortcomings of traditional DRL in training speed and convergence speed, the improved MS-DDQN method is introduced to incorporate the final return value into the previous steps. Finally, the pre-trained model is used as the starting point for a second learning model that simulates the UCAV aerial combat decision-making process based on the basic training method, which helps to shorten the training time and improve the learning efficiency. The improved DRL algorithm significantly accelerates training and estimates the target value more accurately during training, and it can be applied to aerial combat decision-making.
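The two mechanisms named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the three-level discretisation (decrease, hold, increase) of each control is one plausible way to arrive at 27 actions, and `n_step_double_dqn_target` is a hypothetical helper that computes the textbook n-step double-DQN target, which may differ from the exact MS-DDQN update.

```python
from itertools import product

import numpy as np

# Assumption: each of the three controls is decreased, held, or increased.
LEVELS = (-1, 0, 1)
# Each action is (flight path angle, roll angle, velocity) -> 3**3 = 27 actions.
ACTIONS = list(product(LEVELS, repeat=3))


def n_step_double_dqn_target(rewards, gamma, q_online_next, q_target_next, done):
    """Textbook n-step double-DQN target for one window of transitions.

    rewards:        the n rewards r_t, ..., r_{t+n-1}
    q_online_next:  online-network Q-values of the state reached after n steps
    q_target_next:  target-network Q-values of that same state
    done:           True if the episode ended inside the window
    """
    n = len(rewards)
    discounts = gamma ** np.arange(n)
    n_step_return = float(np.dot(discounts, rewards))
    if done:
        return n_step_return
    # Double DQN: the online network selects the action,
    # the target network evaluates it.
    best_action = int(np.argmax(q_online_next))
    bootstrap = float(q_target_next[best_action])
    return n_step_return + gamma ** n * bootstrap
```

Summing several discounted rewards before bootstrapping lets a late reward reach earlier steps in a single update, which is the usual motivation for multistep targets.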
Funding: Supported by the National Natural Science Foundation of China under the project "Study on parallel intelligent optimization simulation with combination of qualitative and quantitative method" (61004089), and by the Graduate Student Innovation Practice Foundation of Beihang University in China (YCSJ-01-201205) under the project "Research of an efficient and intelligent optimization method and application in aircraft shape design".
Abstract: To achieve the optimal attack outcome in air combat under beyond-visual-range (BVR) conditions, the decision-making (DM) problem, which is to set a proper assignment of friendly fighters to hostile fighters, is the most crucial task in cooperative multiple target attack (CMTA). In this paper, a heuristic quantum genetic algorithm (HQGA) is proposed to solve the DM problem. The originality of the work lies in the following aspects: (1) the HQGA assigns all hostile fighters to every missile rather than to fighters, so that it can encode chromosomes with quantum bits (Q-bits); (2) the relative successful sequence probability (RSSP) is defined, based on which the priority attack vector is constructed; (3) the HQGA can heuristically modify quantum chromosomes according to the modification technique proposed in this paper; (4) last but not least, in some special conditions, the HQGA removes a constraint imposed by other algorithms in order to obtain a better result. At the end of the paper, two examples show that the HQGA has advantages over other algorithms when dealing with the DM problem in the context of CMTA.
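As background for the Q-bit encoding mentioned in aspect (1), the sketch below shows how a generic quantum genetic algorithm represents and updates a chromosome. It assumes the common angle parameterisation, where a Q-bit with angle theta has amplitudes (cos theta, sin theta), and a fixed rotation step; the function names and the `delta` value are illustrative. It does not reproduce the HQGA's heuristic modification technique, its RSSP, or its missile-to-target encoding.

```python
import numpy as np


def init_qbits(n_bits):
    # Equal superposition: theta = pi/4 gives amplitudes cos = sin = 1/sqrt(2).
    return np.full(n_bits, np.pi / 4)


def observe(theta, rng):
    # Collapse each Q-bit to a classical bit: 1 with probability sin(theta)**2.
    return (rng.random(theta.shape) < np.sin(theta) ** 2).astype(int)


def rotate_toward(theta, bits, best_bits, delta=0.05 * np.pi):
    # Rotation-gate update: where the observed bit differs from the best-known
    # solution, rotate the Q-bit by delta toward the best solution's bit value.
    direction = np.where(best_bits == 1, 1.0, -1.0)
    step = np.where(bits != best_bits, direction * delta, 0.0)
    # Keep angles in [0, pi/2] so the probabilities stay well defined.
    return np.clip(theta + step, 0.0, np.pi / 2)
```

A typical loop would call `observe` to sample candidate assignments, score them with a problem-specific fitness function, and then call `rotate_toward` with the best candidate found so far.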