Abstract: Reinforcement learning has been applied to air combat problems in recent years, and curriculum learning is often used alongside it, but traditional curriculum learning suffers from plasticity loss in neural networks. Plasticity loss is the difficulty of learning new knowledge after the network has converged. To this end, we propose a motivational curriculum learning distributed proximal policy optimization (MCLDPPO) algorithm, through which trained agents can significantly outperform the predictive game tree and mainstream reinforcement learning methods. Motivational curriculum learning is designed to help the agent gradually improve its combat ability by observing where the agent performs unsatisfactorily and providing appropriate rewards as a guide. Furthermore, complete tactical maneuvers are encapsulated based on existing air combat knowledge, and through the flexible use of these maneuvers, some tactics beyond human knowledge can be realized. In addition, we design an interruption mechanism that increases the agent's decision-making frequency in an emergency: when the number of threats received by the agent changes, the current action is interrupted so that the agent can reacquire observations and decide again. Using the interruption mechanism can significantly improve the performance of the agent. To better simulate actual air combat, we use digital twin technology and propose a parallel battlefield mechanism that runs multiple simulation environments simultaneously, effectively improving data throughput. The experimental results demonstrate that the agent can fully utilize situational information to make reasonable decisions and adapt its tactics in air combat, verifying the effectiveness of the proposed algorithmic framework.
Funding: This work was supported by the National Natural Science Foundation of China (62003359).
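The interruption mechanism described in the abstract above can be illustrated with a minimal sketch. Everything named here (`Observation`, `run_episode`, the `observe`, `decide`, and `step` callbacks, and the idea that each decision returns a maneuver with a planned duration in ticks) is a hypothetical stand-in, not the paper's interface; the sketch only shows the stated rule that a change in the number of threats forces a fresh observation and decision.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    threat_count: int  # hypothetical field: number of threats currently tracked


def run_episode(observe, decide, step, max_ticks):
    """Execute maneuvers tick by tick, re-deciding early when threats change.

    observe(tick) -> Observation
    decide(obs)   -> (action, planned_duration_in_ticks)
    step(action)  -> advances the simulation by one tick
    Returns the list of (tick, action) decision points.
    """
    obs = observe(0)
    action, remaining = decide(obs)
    decisions = [(0, action)]
    last_threats = obs.threat_count
    for tick in range(1, max_ticks + 1):
        step(action)
        remaining -= 1
        obs = observe(tick)
        # Interruption rule: a change in threat count aborts the current maneuver.
        interrupted = obs.threat_count != last_threats
        if interrupted or remaining <= 0:
            action, remaining = decide(obs)
            decisions.append((tick, action))
            last_threats = obs.threat_count
    return decisions
```

In this sketch a maneuver planned for five ticks is cut short as soon as the threat count changes, which is what raises the decision frequency in an emergency.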
Abstract: Today's air combat has reached a high level of uncertainty where continuous or discrete variables with crisp values cannot be properly represented using fuzzy sets. With a set of membership functions, fuzzy logic is well-suited to tackle such complex states and actions. However, it is not necessary to fuzzify variables that have definite discrete semantics. Hence, the aim of this study is to improve the level of model abstraction by proposing multiple levels of cascaded hierarchical structures from the perspective of function, namely, the functional decision tree. This method is developed to represent behavioral modeling of air combat systems, and its metamodel, execution mechanism, and code generation can provide a sound basis for function-based behavioral modeling. As a proof of concept, an air combat simulation is developed to validate this method, and the results show that the fighter Alpha built using the proposed framework performs better than one using default scripts.
Funding: National Key R&D Program of China (Grant No. 2021YFA1000402) and National Natural Science Foundation of China (Grant No. 72071159), which provided funding for conducting the experiments.
Abstract: In the air combat process, confrontation position is the critical factor that determines the confrontation situation, attack effect, and escape probability of UAVs. Therefore, selecting the optimal confrontation position becomes the primary goal of maneuver decision-making. By taking the position as the UAV's maneuver strategy, this paper constructs the optimal confrontation position selecting games (OCPSGs) model. In the OCPSGs model, the payoff function of each UAV is defined by the difference between the comprehensive advantages of both sides, and the strategy space of each UAV at every step is defined by its accessible space, which is determined by its maneuverability. Then we design the limit approximation of mixed strategy Nash equilibrium (LAMSNQ) algorithm, which provides a method to determine the optimal probability distribution of positions in the strategy space. In the simulation phase, we assume that the motions in the three directions are independent and that the strategy space is a cuboid, to simplify the model. Several simulations are performed to verify the feasibility, effectiveness, and stability of the algorithm.
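Because each side's payoff in the abstract above is the difference between the two sides' advantages, the game over a finite set of candidate positions is zero-sum, and a mixed-strategy equilibrium is a probability distribution over positions. The sketch below approximates such a distribution with fictitious play, a standard textbook method swapped in purely for illustration; it is not the LAMSNQ algorithm, whose details the abstract does not give. The function name and the payoff matrices used with it are assumptions.

```python
def fictitious_play(payoff, iterations=20000):
    """Approximate a mixed-strategy equilibrium of a zero-sum matrix game.

    payoff[i][j] is the row player's payoff (the column player receives its
    negative) when the row player picks position i and the column player
    picks position j. Returns the empirical strategy frequencies of both.
    """
    rows, cols = len(payoff), len(payoff[0])
    row_counts = [0] * rows
    col_counts = [0] * cols
    # Both players arbitrarily open with their first position.
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(iterations - 1):
        # Each player best-responds to the opponent's empirical play so far.
        row_values = [sum(payoff[i][j] * col_counts[j] for j in range(cols))
                      for i in range(rows)]
        col_values = [sum(payoff[i][j] * row_counts[i] for i in range(rows))
                      for j in range(cols)]
        best_row = max(range(rows), key=row_values.__getitem__)
        best_col = min(range(cols), key=col_values.__getitem__)
        row_counts[best_row] += 1
        col_counts[best_col] += 1
    row_strategy = [c / iterations for c in row_counts]
    col_strategy = [c / iterations for c in col_counts]
    return row_strategy, col_strategy
```

For a two-position game with payoff `[[1, -1], [-1, 1]]`, the returned frequencies approach an even split between the two positions for both players as the iteration count grows.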
Abstract: With continuous growth in scale, topology complexity, mission phases, and mission diversity, efficient capability evaluation of modern combat systems has become challenging. Aiming at the problems of insufficient mission consideration and a single evaluation dimension in existing evaluation approaches, this study proposes a mission-oriented capability evaluation method for combat systems based on the operation loop. First, a combat network model is given that takes into account the capability properties of combat nodes. Then, based on the transition matrix between combat nodes, an efficient algorithm for operation loop identification is proposed based on breadth-first search. Given the mission-capability satisfaction of nodes, effectiveness evaluation indexes for operation loops and the combat network are proposed, followed by a node importance measure. The effectiveness of the proposed method is verified through a case study of a combat scenario involving space-based support against surface ships under different strategies. The results indicate that the ROI-priority attack method has a notable impact on reducing the overall efficiency of the network, whereas the O-L betweenness-priority attack is more effective in obstructing the successful execution of enemy attack missions.
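As a rough illustration of breadth-first loop identification on a transition matrix, the sketch below enumerates the simple directed cycles that pass through one chosen node. It is a generic graph routine written for this note, not the paper's algorithm; the function name, the 0/1 matrix encoding, and the node roles in the usage example are assumptions.

```python
from collections import deque


def find_loops(adjacency, start, max_len=None):
    """Enumerate simple directed cycles through `start` by breadth-first search.

    adjacency[i][j] is truthy when node i can pass information or effect to
    node j. Each loop is returned as a node list that begins and ends at
    `start`; shorter loops are found before longer ones.
    """
    n = len(adjacency)
    limit = max_len or n  # maximum number of distinct nodes in a loop
    loops = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        last = path[-1]
        for nxt in range(n):
            if not adjacency[last][nxt]:
                continue
            if nxt == start:
                loops.append(path + [start])   # closed a loop
            elif nxt not in path and len(path) < limit:
                queue.append(path + [nxt])     # extend the partial path
    return loops
```

For example, with nodes 0 (target), 1 (sensor), 2 (decision), and 3 (influencer), and edges 0→1, 1→2, 1→3, 2→3, 3→0, calling `find_loops(adjacency, 0)` yields the short loop through the sensor and influencer first, then the full four-node loop.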
Abstract: Future unmanned battles urgently require intelligent combat policies, and multi-agent reinforcement learning offers a promising solution. However, due to the complexity of combat operations and the large size of the combat group, this task suffers from the credit assignment problem more than other reinforcement learning tasks. This study uses reward shaping to relieve the credit assignment problem and improve policy training for the new generation of large-scale unmanned combat operations. We first prove that multiple reward shaping functions do not change the Nash equilibrium in stochastic games, providing theoretical support for their use. According to the characteristics of combat operations, we propose tactical reward shaping (TRS), which comprises maneuver shaping advice and threat assessment-based attack shaping advice. Then, we investigate the effects of different types and combinations of shaping advice on combat policies through experiments. The results show that TRS improves both the efficiency and attack accuracy of combat policies, with the combination of maneuver reward shaping advice and ally-focused attack shaping advice achieving the best performance compared with the baseline strategy.
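One common way to combine several shaping terms without altering what the agent is ultimately optimizing is potential-based shaping, where each term has the form gamma * phi(s') - phi(s). The abstract does not say that TRS takes exactly this form, so the sketch below is an assumption for illustration only; the state layout `(distance_to_target, threat_level)` and both potential functions are hypothetical.

```python
def maneuver_potential(state):
    """Hypothetical potential: being closer to the target is better."""
    distance_to_target, _threat_level = state
    return -distance_to_target


def threat_potential(state):
    """Hypothetical potential: a lower assessed threat is better."""
    _distance_to_target, threat_level = state
    return -threat_level


def shaped_reward(reward, state, next_state, potentials, gamma=0.99):
    """Add one potential-based shaping term per potential to the raw reward."""
    shaping = sum(gamma * phi(next_state) - phi(state) for phi in potentials)
    return reward + shaping
```

Along any trajectory the discounted shaping terms telescope to gamma^T * phi(s_T) - phi(s_0), which depends only on the first and last states. That is the usual intuition for why adding such terms does not change which policies are optimal.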
Abstract: Lying in her makeshift hospital bed, Joyce Tembo thanked medical personnel for evacuating her to the designated national cholera treatment centre, 6 km north of Zambia's capital Lusaka. She was recently diagnosed with diarrhoeal disease. Tembo, 43, commended the medical staff stationed at the treatment centre for their great service to thousands of patients, especially women and children seeking urgent treatment. "I am very grateful to the Chinese doctors who attended to me as soon as the ambulance rushed me to the clinic where I received urgent treatment; they have really saved my life," Tembo told ChinAfrica. But not all residents in her community are as lucky as her. Many in the densely populated slums die every day due to the area's poor sanitation, one of the major causes of the cholera outbreak.
Abstract: Since free combat is a competitive sport that flexibly utilizes kicking, punching, wrestling, and holding techniques to defeat the opponent, good core strength can help athletes improve their technical level, enhance the quality of their movements, and protect their joints and muscles. To carry out high-quality core strength training in free combat teaching, coaches first need to carry out simple training, centralized training, and extended training according to the basic plan of adaptation, stabilization, and improvement. Second, it is also important to test the athlete's physical and athletic qualities before implementing the specific training plan, optimize the training program, and carry out statistical analysis of the stage training data in order to achieve the best training effect.
Abstract: By combining a heuristic algorithm (HA), developed from specific knowledge of cooperative multiple target attack (CMTA) tactics, with particle swarm optimization (PSO), a heuristic particle swarm optimization (HPSO) algorithm is proposed to solve the decision-making (DM) problem. HA facilitates the search for a local optimum in the neighborhood of a solution, while the PSO algorithm tends to explore the search space for possible solutions. Combining the advantages of HA and PSO, the HPSO algorithm can find the global optimum quickly and efficiently. It obtains the DM solution by seeking the optimal assignment of the missiles of friendly fighter aircraft (FAs) to hostile FAs. Simulation results show that the proposed algorithm is superior to the general PSO algorithm and two GA-based algorithms in searching for the best solution to the DM problem.
Abstract: Combat survivability is an essential factor to be considered in the development of recent military aircraft. Radar stealth and onboard electronic attack are two major techniques for reducing aircraft susceptibility. A tactical scenario for a strike mission is presented. The effects of aircraft radar cross section and of an onboard jammer on the detection probability of a threat radar are investigated. The guidance errors of radar-guided surface-to-air missiles and anti-aircraft artillery, as disturbed by radar cross section reduction, jammer radiated power, or both, are determined. The probability of aircraft kill given a single shot is calculated, and finally the sortie survivability of an attack aircraft in a supposed hostile threat environment is worked out. It is demonstrated that the survivability of a combat aircraft will be greatly enhanced by combining radar stealth with onboard electronic attack, and that the evaluation methodology is effective and applicable.
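The chain from radar cross section to sortie survivability can be sketched with two textbook relations: radar detection range scales with the fourth root of radar cross section, and, if encounters and shots are treated as independent, sortie survivability is the product of the per-encounter survival probabilities. The abstract does not give its actual model, so the function names, the independence assumption, and every number in the usage note are illustrative only.

```python
def detection_range_ratio(rcs_reduced, rcs_baseline):
    """Ratio of radar detection ranges after an RCS change.

    Follows the radar range equation, in which detection range is
    proportional to the fourth root of radar cross section.
    """
    return (rcs_reduced / rcs_baseline) ** 0.25


def sortie_survivability(encounters):
    """Probability of surviving every threat encounter in a sortie.

    Each encounter is (p_engage, p_kill_single_shot, shots). Encounters and
    shots are assumed independent, which is a simplifying assumption.
    """
    survival = 1.0
    for p_engage, p_kill_single_shot, shots in encounters:
        p_kill = 1.0 - (1.0 - p_kill_single_shot) ** shots
        survival *= 1.0 - p_engage * p_kill
    return survival
```

With these relations, reducing RCS from 10 to 0.1 (same units) shrinks the detection range to about 32% of its baseline, and any measure that lowers either the engagement probability or the single-shot kill probability raises the product returned by `sortie_survivability`.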
Abstract: In evaluating the combat effectiveness of a defense system, the target's probability of penetrating the defended area is a primary index of concern. In this paper, a stochastic model to compute the probability that a target penetrates the defended area along any flight path is established using the state analysis and statistical equilibrium analysis of stochastic service system theory. The simulated annealing algorithm is a heuristic random search method based on Monte Carlo iteration, and it can find the global optimal solution by simulating the annealing process. Combining the stochastic penetration-probability model with the simulated annealing algorithm, this paper establishes a quantitative method for optimizing the combat configuration of weapon systems. The calculated results show that the optimal configuration for the fire cells of the weapon is found quickly using this method, and that this quantitative method for combat configuration is faster and more scientific than the previous one, which relied on principles applied via a fire-field map.
Funding: Supported by the Aviation Science Foundation of China (20152096019).
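The simulated annealing search mentioned in the abstract above can be sketched generically as follows. The cost function stands in for the penetration probability of a candidate configuration and the neighbor function for a small change to that configuration; both, along with the cooling schedule and its parameters, are hypothetical choices rather than the paper's. In practice the method is a heuristic: it tends toward good solutions, and this sketch simply returns the best configuration it has visited.

```python
import math
import random


def simulated_annealing(initial, cost, neighbor,
                        t_start=10.0, t_end=1e-3, cooling=0.995, seed=0):
    """Minimize `cost` by simulated annealing with geometric cooling.

    neighbor(config, rng) must return a nearby candidate configuration.
    Returns the best configuration visited and its cost.
    """
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t_start
    while temperature > t_end:
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with probability
        # exp(-delta / T), which shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost
```

A configuration here can be any Python value, for example a tuple of fire-cell positions, as long as `cost` and `neighbor` agree on its layout.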
Abstract: A method is proposed to resolve the typical problem of air combat situation assessment. Taking one-to-one air combat as an example, and on the basis of air combat data recorded by the air combat maneuvering instrument, the problem of air combat situation assessment is treated as equivalent to the situation classification problem of air combat data. The fuzzy C-means clustering algorithm is used to cluster the selected air combat sample data, and the situation classification of the data is determined by data correlation analysis in combination with the clustering results and the pilots' description of the air combat process. On the basis of a semi-supervised naive Bayes classifier, an improved algorithm is proposed based on data classification confidence, through which the situation classification of air combat data is carried out. The simulation results show that the improved algorithm can assess the air combat situation effectively, and that the improvement can raise classification performance without significantly affecting the efficiency of the classifier.
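For readers unfamiliar with fuzzy C-means, a minimal pure-Python sketch of the standard update rules is given below. The feature vectors, the number of clusters, the fuzzifier m = 2, and the fixed iteration count are assumptions for illustration; the abstract does not state which air combat features or settings were used.

```python
import math


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Standard fuzzy C-means updates from given initial centers.

    points and centers are sequences of equal-length numeric tuples.
    Returns (centers, memberships), where memberships[k][i] is the degree
    to which point k belongs to cluster i, computed in the final iteration
    just before the last center update.
    """
    centers = [list(c) for c in centers]
    dims = len(points[0])
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: closer centers receive higher membership.
        memberships = []
        for p in points:
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            memberships.append(
                [1.0 / sum((d_i / d_j) ** power for d_j in dists)
                 for d_i in dists]
            )
        # Center update: mean of the points weighted by membership ** m.
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            centers[i] = [
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ]
    return centers, memberships
```

Unlike hard clustering, each sample keeps a graded membership in every cluster, and the memberships of a sample sum to one. Those graded values are the kind of information a later step could read as a classification confidence.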