Funding: Supported by the National Natural Science Foundation of China (No. 61573286), the Aeronautical Science Foundation of China (No. 20180753006), the Fundamental Research Funds for the Central Universities (3102019ZDHKY07), the Natural Science Foundation of Shaanxi Province (2020JQ-218), and the Shaanxi Province Key Laboratory of Flight Control and Simulation Technology.
Abstract: Recent advances in on-board radar and missile capabilities, combined with individual payload limitations, have led to increased interest in the use of unmanned combat aerial vehicles (UCAVs) for cooperative occupation during beyond-visual-range (BVR) air combat. However, prior research on occupation decision-making in BVR air combat has mostly been limited to one-on-one scenarios. As such, this study presents a practical cooperative occupation decision-making methodology for use with multiple UCAVs. The weapon engagement zone (WEZ) and combat geometry were first used to develop an advantage function for situational assessment of one-on-one engagement. An encircling advantage function was then designed to represent the cooperation of UCAVs, thereby establishing a cooperative occupation model. The corresponding objective function was derived from the one-on-one engagement advantage function and the encircling advantage function. The resulting model exhibited similarities to a mixed-integer nonlinear programming (MINLP) problem. As such, an improved discrete particle swarm optimization (DPSO) algorithm was used to identify a solution. The occupation process was then converted into a formation switching task as part of the cooperative occupation model. A series of simulations were conducted to verify occupation solutions in varying situations, including two-on-two engagement. Simulated results showed these solutions varied with initial conditions and weighting coefficients. This occupation process, based on formation switching, effectively demonstrates the viability of the proposed technique. These cooperative occupation results could provide a theoretical framework for subsequent research in cooperative BVR air combat.
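The optimization step in the abstract above can be illustrated with a minimal sketch. It is a plain discrete PSO, not the paper's improved DPSO, and it reduces the occupation problem to assigning each UCAV to one target. The advantage matrix `adv`, the coverage bonus standing in for the encircling advantage function, the weight `w_enc`, and the update probabilities are all assumptions made for illustration.

```python
import random

def objective(assign, adv, w_enc):
    """Hypothetical score: summed one-on-one advantage plus a coverage bonus."""
    pairwise = sum(adv[i][j] for i, j in enumerate(assign))
    # Assumed stand-in for an encircling term: fraction of targets covered.
    coverage = len(set(assign)) / len(adv[0])
    return pairwise + w_enc * coverage

def dpso(adv, w_enc=0.5, particles=20, iters=50, seed=0):
    """Discrete PSO over assignment vectors (UCAV index -> target index)."""
    rng = random.Random(seed)
    n_uav, n_tgt = len(adv), len(adv[0])
    swarm = [[rng.randrange(n_tgt) for _ in range(n_uav)]
             for _ in range(particles)]
    pbest = [p[:] for p in swarm]
    pbest_score = [objective(p, adv, w_enc) for p in swarm]
    g = max(range(particles), key=pbest_score.__getitem__)
    gbest, gbest_score = pbest[g][:], pbest_score[g]
    for _ in range(iters):
        for k, p in enumerate(swarm):
            for d in range(n_uav):
                r = rng.random()
                if r < 0.4:        # pull toward this particle's own best
                    p[d] = pbest[k][d]
                elif r < 0.8:      # pull toward the swarm's best
                    p[d] = gbest[d]
                elif r < 0.9:      # random mutation keeps some diversity
                    p[d] = rng.randrange(n_tgt)
                # otherwise the component is left unchanged
            s = objective(p, adv, w_enc)
            if s > pbest_score[k]:
                pbest[k], pbest_score[k] = p[:], s
                if s > gbest_score:
                    gbest, gbest_score = p[:], s
    return gbest, gbest_score

if __name__ == "__main__":
    adv = [[0.9, 0.2], [0.3, 0.8]]  # made-up advantage values
    print(dpso(adv))
```

Changing `w_enc` shifts the balance between individual advantage and target coverage, which loosely mirrors the abstract's remark that solutions vary with the weighting coefficients.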
Funding: The authors gratefully acknowledge the support of the National Natural Science Foundation of China under Grant No. 62076204 and Grant No. 61612385, in part by the Postdoctoral Science Foundation of China under Grant No. 2021M700337, and in part by the Fundamental Research Funds for the Central Universities under Grant No. 3102019ZX016.
Abstract: Accurate online recognition of target tactical intention in beyond-visual-range (BVR) air combat is an important basis for deep situational awareness and autonomous air combat decision-making, and it can create pre-emptive tactical opportunities for the fighter to gain air superiority. Existing methods for this problem have several defects, such as dependence on empirical knowledge, difficulty in interpreting the recognition results, and inability to meet the requirements of actual air combat. Therefore, this study proposes an online hierarchical recognition method for target tactical intention in BVR air combat based on a cascaded support vector machine (CSVM). Through mechanism analysis of BVR air combat, the instantaneous and cumulative feature information of the target trajectory and the relative situation information are introduced successively, using online automatic decomposition of the target trajectory and hierarchical progression. A hierarchical recognition model is then constructed that proceeds from target maneuver element through tactical maneuver to tactical intention. The CSVM algorithm is designed to solve this model, and the cascaded structure decomposes the computational complexity to overcome the problems of convergence and timeliness when the dimensions and number of training samples are large. Meanwhile, the recognition result of each layer can be used to support the composition analysis and interpretation of target tactical intention. The simulation results show that the proposed method can effectively realize multi-dimensional, accurate online recognition of target tactical intention in BVR air combat.
Funding: Co-supported by the National Natural Science Foundation of China (No. 52272382), the Aeronautical Science Foundation of China (No. 20200017051001), and the Fundamental Research Funds for the Central Universities, China.
Abstract: Highly intelligent Unmanned Combat Aerial Vehicle (UCAV) formation is expected to bring out strengths in Beyond-Visual-Range (BVR) air combat. Although Multi-Agent Reinforcement Learning (MARL) shows outstanding performance in cooperative decision-making, it is challenging for existing MARL algorithms to quickly converge to an optimal strategy for UCAV formation in BVR air combat, where confrontation is complicated and reward is extremely sparse and delayed. To solve this problem, this paper proposes an Advantage Highlight Multi-Agent Proximal Policy Optimization (AHMAPPO) algorithm. First, at every step, the AHMAPPO records the degree to which the best formation exceeds the average of formations in parallel environments and carries out additional advantage sampling according to it. Then, the sampling result is introduced into the updating process of the actor network to improve its optimization efficiency. Finally, the simulation results reveal that, compared with some state-of-the-art MARL algorithms, the AHMAPPO can obtain a better strategy using fewer sample episodes in the UCAV formation BVR air combat simulation environment built in this paper, which reflects the critical features of BVR air combat. The AHMAPPO can significantly increase the convergence efficiency of the strategy for UCAV formation in BVR air combat, with a maximum increase of 81.5% relative to other algorithms.
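The first AHMAPPO step described above, measuring how far the best formation exceeds the average across parallel environments, can be sketched as follows. The abstract does not give the actual formula or sampling rule, so the function names, the use of per-environment returns as the comparison quantity, and the `per_unit` and `cap` parameters are all assumptions.

```python
from statistics import mean

def highlight(returns):
    """Return (index of the best environment, gap between best and mean return)."""
    best = max(range(len(returns)), key=returns.__getitem__)
    return best, returns[best] - mean(returns)

def extra_samples(gap, per_unit=4, cap=16):
    """Hypothetical rule: draw more extra samples from the best rollout
    when it exceeds the average by a larger margin, up to a fixed cap."""
    return min(cap, max(0, round(gap * per_unit)))

if __name__ == "__main__":
    step_returns = [1.0, 3.0, 2.0]  # made-up returns from three parallel environments
    idx, gap = highlight(step_returns)
    print(idx, gap, extra_samples(gap))
```

In a full training loop the extra samples would be fed into the actor update, as the abstract states; that part is omitted here because the abstract gives no detail on it.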
Funding: This study was co-supported by the Natural Science Basic Research Program of Shaanxi, China (No. 2022JQ-593), the Key R&D Program of Shaanxi Provincial Department of Science and Technology, China (No. 2022GY-089), and the Aeronautical Science Foundation of China (No. 20220013053005).
Abstract: In modern Beyond-Visual-Range (BVR) aerial combat, unmanned loyal wingmen are pivotal, yet their autonomous capabilities are limited. Our study introduces an advanced control algorithm based on hierarchical reinforcement learning to enhance these capabilities for critical missions such as target search, positioning, and relay guidance. Structured on a dual-layer model, the algorithm’s lower layer manages basic aircraft maneuvers for optimal flight, while the upper layer processes battlefield dynamics and issues precise navigational commands. This approach enables accurate navigation and effective reconnaissance for lead aircraft. Notably, our Hierarchical Prior-augmented Proximal Policy Optimization (HPE-PPO) algorithm employs a prior-based training, prior-free execution method, accelerating target positioning training and ensuring robust target reacquisition. This paper also improves the effectiveness of missile relay guidance. By integrating this system with a human-piloted lead aircraft, this paper proposes a potent solution for cooperative aerial warfare. Rigorous experiments demonstrate enhanced survivability and efficiency of loyal wingmen, marking a significant contribution to Unmanned Aerial Vehicle (UAV) formation control research. This advancement is poised to drive substantial interest and progress in the related technological fields.