The kinetic energy interceptor (KEI) is designed mainly to intercept medium-range, intermediate-range, and intercontinental ballistic missiles in their boost, ascent, and midcourse phases, and is characterized by high velocity and high acceleration. Based on analysis of the open literature together with modeling and simulation, the overall, aerodynamic, and propulsion parameters of the KEI missile were reverse-engineered, and its flight and intercept performance was then simulated. The results show that the KEI missile can accelerate to 6 km/s in about 60 s and is capable of intercepting typical ballistic missile targets in the boost/ascent phase, which provides a useful reference for the development and study of domestic interceptor weapons.
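The quoted burnout performance (6 km/s in about 60 s) can be sanity-checked with a back-of-envelope calculation; the constant-acceleration assumption below is mine, not the paper's, and ignores gravity and drag losses:

```python
# Rough check of the KEI burnout figures quoted in the abstract.
# Assumption (not from the paper): constant acceleration over the burn.
v_burnout = 6000.0   # m/s, ~6 km/s burnout velocity from the abstract
t_burn    = 60.0     # s, quoted acceleration time
g0        = 9.80665  # m/s^2, standard gravity

a_avg = v_burnout / t_burn   # average acceleration: 100 m/s^2
print(a_avg / g0)            # ~10.2 g, consistent with a "high-acceleration" booster
```

This supports the abstract's "high-speed, high-acceleration" characterization: a sustained average load of roughly 10 g over the whole boost.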
This work proposes a recorded recurrent twin delayed deep deterministic (RRTD3) policy gradient algorithm to solve the challenge of constructing guidance laws for intercepting endoatmospheric maneuvering missiles under uncertainties and observation noise. The attack-defense engagement scenario is modeled as a partially observable Markov decision process (POMDP). Given the benefits of recurrent neural networks (RNNs) in processing sequence information, an RNN layer is incorporated into the agent's policy network to alleviate the bottleneck that traditional deep reinforcement learning methods face when dealing with POMDPs. Because the detection frequency of an interceptor is usually higher than its guidance frequency, the measurements from the interceptor's seeker during each guidance cycle are combined into one sequence as the input to the policy network. During training, the hidden states of the RNN layer in the policy network are recorded to overcome the partial observability that this RNN layer itself introduces inside the agent. The training curves show that the proposed RRTD3 improves data efficiency, training speed, and training stability. The test results confirm the advantages of the RRTD3-based guidance laws over several conventional guidance laws.
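The abstract's core mechanism can be sketched as follows: per guidance cycle, a sequence of seeker measurements is folded through an RNN, and the hidden state entering each cycle is recorded alongside the transition so training can replay it. The scalar tanh cell, the made-up weights, and the buffer layout below are illustrative assumptions, not the paper's actual network:

```python
import math
import random

def rnn_step(x, h, wx, wh, b):
    # Single tanh RNN cell (scalar weights for brevity): h' = tanh(wx*x + wh*h + b)
    return math.tanh(wx * x + wh * h + b)

def policy_action(h, wa, ba):
    # Map the hidden state to a bounded guidance command (e.g., lateral acceleration)
    return math.tanh(wa * h + ba)

# Hypothetical, untrained parameters -- stand-ins for the policy network's weights
wx, wh, b = 0.5, 0.9, 0.0
wa, ba = 1.0, 0.0

random.seed(0)
replay_buffer = []  # each entry: (measurement sequence, recorded entry hidden state, action)
h = 0.0             # RNN hidden state, carried across guidance cycles

for cycle in range(3):           # a few guidance cycles
    h0 = h                       # "recorded" trick: snapshot the hidden state entering the cycle
    # Seeker runs faster than guidance, so several measurements form one input sequence
    seq = [random.uniform(-1.0, 1.0) for _ in range(5)]
    for x in seq:
        h = rnn_step(x, h, wx, wh, b)
    a = policy_action(h, wa, ba)
    replay_buffer.append((seq, h0, a))

print(len(replay_buffer))  # 3 transitions, each carrying its recorded hidden state
```

Recording `h0` with each transition is what lets off-policy updates re-run the RNN from the state it actually had during data collection, instead of from an arbitrary zero state.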
Funding: supported by the National Natural Science Foundation of China (Grant No. 12072090).