Funding: Supported by the National Natural Science Foundation of China (Grant No. 12072090).
Abstract: This work proposes a recorded recurrent twin delayed deep deterministic policy gradient (RRTD3) algorithm for constructing guidance laws that intercept endoatmospheric maneuvering missiles under uncertainties and observation noise. The attack-defense engagement scenario is modeled as a partially observable Markov decision process (POMDP). Because recurrent neural networks (RNNs) are well suited to processing sequence information, an RNN layer is incorporated into the agent's policy network to alleviate the bottleneck that traditional deep reinforcement learning methods face when dealing with POMDPs. Since the detection frequency of an interceptor is usually higher than its guidance frequency, the measurements from the interceptor's seeker during each guidance cycle are combined into one sequence and used as the input to the policy network. During training, the hidden states of the RNN layer in the policy network are recorded to overcome the partial observability that this RNN layer itself introduces inside the agent. The training curves show that the proposed RRTD3 improves data efficiency, training speed, and training stability. The test results confirm the advantages of the RRTD3-based guidance laws over some conventional guidance laws.
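The two data-handling ideas in this abstract (stacking the seeker measurements of one guidance cycle into a sequence, and recording the RNN hidden state alongside each transition) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `RecordedReplayBuffer`, the helper `stack_measurements`, and all array shapes are assumptions introduced here for illustration, and the sketch covers only storage and sampling, not the TD3 update itself.

```python
import numpy as np


def stack_measurements(measurements):
    """Stack the seeker measurements taken within one guidance cycle
    into a single (seq_len, obs_dim) array for the policy network."""
    return np.stack([np.asarray(m, dtype=np.float32) for m in measurements])


class RecordedReplayBuffer:
    """Hypothetical replay buffer that also records RNN hidden states.

    Storing the hidden state seen at collection time lets a training batch
    re-initialize the recurrent layer instead of starting from zeros.
    """

    def __init__(self, capacity, seq_len, obs_dim, act_dim, hidden_dim, seed=0):
        self.capacity, self.size, self.ptr = capacity, 0, 0
        self.rng = np.random.default_rng(seed)
        self.data = {
            "obs_seq": np.zeros((capacity, seq_len, obs_dim), np.float32),
            "hidden": np.zeros((capacity, hidden_dim), np.float32),
            "action": np.zeros((capacity, act_dim), np.float32),
            "reward": np.zeros((capacity,), np.float32),
            "next_obs_seq": np.zeros((capacity, seq_len, obs_dim), np.float32),
            "next_hidden": np.zeros((capacity, hidden_dim), np.float32),
            "done": np.zeros((capacity,), np.float32),
        }

    def add(self, obs_seq, hidden, action, reward, next_obs_seq, next_hidden, done):
        """Write one transition at the current pointer, overwriting the oldest."""
        values = (obs_seq, hidden, action, reward, next_obs_seq, next_hidden, done)
        for key, value in zip(self.data, values):
            self.data[key][self.ptr] = value
        self.ptr = (self.ptr + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size):
        """Draw a uniform random batch, hidden states included."""
        idx = self.rng.integers(0, self.size, size=batch_size)
        return {key: array[idx] for key, array in self.data.items()}
```

In a training loop built on this sketch, the sampled `hidden` and `next_hidden` arrays would be passed to the recurrent layers of the actor and target actor as their initial states; how the original work uses the recorded states in detail is not stated in the abstract.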
Abstract: Deploying multiple unmanned aerial vehicle (UAV) swarms along an interception line is a new combat style for preventing attackers from breaking through a blockade. To determine the optimal swarm positions in cooperative interception, a deployment optimization model that minimizes the area of the penetration zones is presented, and an analytical expression for the optimal deployment positions is derived. First, from the viewpoint of attackers breaking through the interception line, vertical penetration and oblique penetration are analyzed separately, and mathematical models of the penetration zones are obtained for a single UAV swarm and for multiple UAV swarms. Second, with minimization of the penetration area as the objective, the deployment optimization model for swarm positions is formulated, and its analytical solution is obtained using convex programming theory. Finally, the proposed optimal deployment is compared with uniform deployment and random deployment to verify the theoretical analysis.
Funding: Supported by the Foundation Strengthening Program Technology Field Foundation (2020-JCJQ-JJ-132).
Abstract: The interception probability of a single missile is the basis for combat plan design and weapon performance evaluation, but its influencing factors are complex and mutually coupled. Existing calculation methods offer only limited analysis of how these factors act, and none of them analyzes the influence of the guidance law. This paper considers the influencing factors of both the interceptor and the target more comprehensively. Interceptor parameters include speed, guidance law, guidance error, fuze error, and fragment killing ability, while target parameters include speed, maneuverability, and vulnerability. An interception model is established, Monte Carlo simulation is carried out, and the influence mechanism of each factor is analyzed based on the model and the simulation results. Finally, this paper proposes a classification-regression neural network that quickly estimates the interception probability from the values of the influencing factors. The proposed method reduces the interference of invalid interception data with valid data, so its prediction accuracy is significantly better than that of pure regression neural networks.
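The Monte Carlo step mentioned in this abstract can be illustrated with a deliberately simplified toy model. The sketch below is not the paper's interception model: it assumes, purely for illustration, that the miss vector is a zero-mean isotropic 2D Gaussian with standard deviation `sigma` and that the target is killed whenever the miss distance is within a fixed `lethal_radius`. It ignores the guidance law, fuze error, fragment effects, and target maneuvers that the paper considers. Under these assumptions the miss distance is Rayleigh distributed, so the estimate can be checked against the closed form 1 - exp(-R^2 / (2 sigma^2)).

```python
import numpy as np


def monte_carlo_kill_probability(sigma, lethal_radius, n_trials=200_000, seed=0):
    """Estimate the kill probability by sampling 2D Gaussian miss vectors
    (toy assumption) and counting those inside the lethal radius."""
    rng = np.random.default_rng(seed)
    miss_xy = rng.normal(0.0, sigma, size=(n_trials, 2))
    miss_distance = np.linalg.norm(miss_xy, axis=1)
    return float(np.mean(miss_distance <= lethal_radius))


def rayleigh_kill_probability(sigma, lethal_radius):
    """Closed-form value for the same toy model (Rayleigh CDF)."""
    return float(1.0 - np.exp(-lethal_radius**2 / (2.0 * sigma**2)))


if __name__ == "__main__":
    estimate = monte_carlo_kill_probability(sigma=2.0, lethal_radius=3.0)
    exact = rayleigh_kill_probability(sigma=2.0, lethal_radius=3.0)
    print(f"Monte Carlo: {estimate:.4f}, closed form: {exact:.4f}")
```

A realistic model has no closed form, which is why simulation (and, in the paper, a neural network trained on simulation results) is needed; the toy case is useful mainly as a sanity check on the sampling code.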