Abstract: The kinetic energy interceptor (KEI) is designed mainly to intercept medium-range, intermediate-range, and intercontinental ballistic missiles in their boost, ascent, and midcourse flight phases, and is characterized by high speed and high acceleration. Based on analysis of the open literature together with modeling and simulation, the overall, aerodynamic, and propulsion parameters of the KEI missile are reverse-engineered and studied, and its flight performance and intercept performance are simulated. The results show that the KEI missile can accelerate to 6 km/s within about 60 s and is capable of intercepting typical ballistic missile targets in the boost/ascent phase, which provides a useful reference for the development and research of domestic interceptor weapons.
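To make the quoted flight-performance figure concrete, the sketch below integrates a two-stage point-mass boost and reports burnout time and velocity. All vehicle numbers (masses, specific impulses, burn times, the drag model) are illustrative assumptions chosen to land near the "about 6 km/s in roughly 60 s" regime, not the reverse-engineered KEI parameters from the paper.

```python
# Minimal two-stage point-mass boost simulation (vertical ascent, flat-earth
# gravity, exponential atmosphere). All numbers are illustrative assumptions,
# NOT the reconstructed KEI parameters.
import math

G0 = 9.80665          # standard gravity, m/s^2
CD, AREF = 0.3, 0.25  # assumed drag coefficient and reference area, m^2

# (propellant mass kg, dropped structure kg, Isp s, burn time s) per stage
STAGES = [(7000.0, 900.0, 265.0, 30.0),
          (2500.0, 350.0, 285.0, 30.0)]
PAYLOAD = 200.0       # kg, assumed kill vehicle + shroud

def density(h):
    """Simple exponential atmosphere (assumption)."""
    return 1.225 * math.exp(-h / 7200.0)

def simulate(dt=0.01):
    m = PAYLOAD + sum(p + s for p, s, _, _ in STAGES)
    v, h, t = 0.0, 0.0, 0.0
    for prop, struct, isp, tb in STAGES:
        mdot = prop / tb                  # constant mass flow
        thrust = mdot * isp * G0
        for _ in range(int(tb / dt)):
            drag = 0.5 * density(h) * v * abs(v) * CD * AREF
            a = (thrust - drag) / m - G0  # net axial acceleration
            v += a * dt
            h += v * dt
            m -= mdot * dt
            t += dt
        m -= struct                       # drop spent stage
    return t, v, h

if __name__ == "__main__":
    t, v, h = simulate()
    print(f"burnout: t = {t:.1f} s, v = {v/1000:.2f} km/s, h = {h/1000:.1f} km")
```

Varying the stage masses and burn times in such a model is one plausible way to reverse-fit total-impulse and thrust-profile parameters against a published "velocity in time" claim.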
Funding: supported by the National Natural Science Foundation of China (Grant No. 12072090).
Abstract: This work proposes a recorded recurrent twin delayed deep deterministic (RRTD3) policy gradient algorithm to solve the challenge of constructing guidance laws for intercepting endoatmospheric maneuvering missiles under uncertainties and observation noise. The attack-defense engagement scenario is modeled as a partially observable Markov decision process (POMDP). Given the benefits of recurrent neural networks (RNNs) in processing sequence information, an RNN layer is incorporated into the agent's policy network to alleviate the bottleneck that traditional deep reinforcement learning methods face when dealing with POMDPs. Since the detection frequency of an interceptor is usually higher than its guidance frequency, the measurements from the interceptor's seeker during each guidance cycle are combined into one sequence as the input to the policy network. During training, the hidden states of the RNN layer in the policy network are recorded to overcome the partial observability that this RNN layer itself introduces inside the agent. The training curves show that the proposed RRTD3 improves data efficiency, training speed, and training stability. The test results confirm the advantages of the RRTD3-based guidance laws over several conventional guidance laws.
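The core architectural idea, a recurrent layer fed one guidance cycle's worth of seeker measurements, with its hidden state recorded alongside each stored transition, can be sketched as below. The GRU choice, layer sizes, and all identifiers are illustrative assumptions, not the authors' exact RRTD3 network.

```python
# Sketch (PyTorch) of a recurrent actor: seeker measurements within one
# guidance cycle form a short sequence, a GRU summarizes it, and the hidden
# state is recorded with each transition so replayed data stays consistent.
# Layer sizes and the GRU choice are assumptions, not the paper's design.
import torch
import torch.nn as nn

class RecurrentActor(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden_dim=128):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden_dim, 128), nn.ReLU(),
            nn.Linear(128, act_dim), nn.Tanh(),  # normalized command
        )

    def forward(self, obs_seq, h):
        # obs_seq: (batch, seq_len, obs_dim) -- seeker measurements in one
        # guidance cycle (detection rate > guidance rate, so seq_len > 1)
        out, h_next = self.rnn(obs_seq, h)
        action = self.head(out[:, -1])           # act once per guidance cycle
        return action, h_next

# Rollout loop sketch: store the pre-step hidden state with each transition,
# which is the "recorded" part that keeps replayed samples well-defined.
actor = RecurrentActor(obs_dim=4, act_dim=2)
h = torch.zeros(1, 1, 128)                       # initial hidden state
buffer = []
for step in range(10):                           # stand-in engagement episode
    obs_seq = torch.randn(1, 5, 4)               # stand-in: 5 seeker readings
    with torch.no_grad():
        action, h_next = actor(obs_seq, h)
    buffer.append((obs_seq, h.clone(), action))  # recorded hidden state
    h = h_next
```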
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 61873278 and 62173339).
Abstract: This paper presents a neighborhood optimal trajectory online correction algorithm that accounts for terminal time variation, and investigates its application range. Firstly, the motion model of midcourse guidance is established, and the online trajectory correction-regenerating strategy is introduced. Secondly, based on neighborhood optimal control theory, the proposed algorithm extends the traditional neighborhood optimal trajectory correction method by explicitly accounting for terminal time variation. Thirdly, the Monte Carlo simulation method is used to analyze the application range of the algorithm, which provides a basis for dividing the application domains of the online correction algorithm and the online regeneration algorithm for midcourse guidance trajectories. Finally, the simulation results show that the algorithm has high real-time performance and that the online corrected trajectory can meet the requirements imposed by terminal constraint changes. The application range of the algorithm is obtained through Monte Carlo simulation.
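A neighboring-optimal correction of this kind typically perturbs the stored nominal control with precomputed feedback gains; the sketch below adds the extra terminal-time term the abstract describes. The linear form and the gain names (K_x, k_tf) are assumptions about the standard neighboring-optimal structure, not the paper's exact correction law.

```python
# Sketch of a neighboring-optimal control correction with a terminal-time
# variation term: u(t) = u_nom(t) - K_x(t)(x - x_nom(t)) - k_tf(t) * d_tf.
# Gains, dimensions, and numbers are illustrative assumptions only.
import numpy as np

def corrected_control(u_nom, K_x, k_tf, x, x_nom, d_tf):
    """Linear correction about the stored nominal trajectory."""
    return u_nom - K_x @ (x - x_nom) - k_tf * d_tf

# Illustrative numbers: 4 states, 2 controls, evaluated at one time instant.
rng = np.random.default_rng(0)
u_nom = np.array([0.0, 0.0])            # nominal control on stored trajectory
K_x = rng.normal(size=(2, 4)) * 0.1     # state-feedback gain (precomputed)
k_tf = np.array([0.02, -0.01])          # terminal-time sensitivity gain
x_nom = np.zeros(4)
x = x_nom + rng.normal(size=4) * 0.05   # dispersed state
d_tf = 1.5                              # commanded terminal-time change, s
print(corrected_control(u_nom, K_x, k_tf, x, x_nom, d_tf))
```

An application-range study in the abstract's sense would wrap a call like this in a Monte Carlo loop over state dispersions and terminal-time changes, flagging the dispersion levels beyond which the corrected trajectory violates the terminal constraints and full regeneration is needed instead.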