Funding: Supported by the National Defense Science and Technology Innovation (18-163-15-LZ-001-004-13).
Abstract: This paper investigates a reinforcement learning (RL) based guidance method for coplanar orbital interception in a continuous low-thrust scenario. The problem is formulated as a Markov decision process (MDP), and a well-designed RL algorithm, experience-based deep deterministic policy gradient (EBDDPG), is proposed to solve it. By taking advantage of prior information generated through the optimal control model, the proposed algorithm not only resolves the convergence problem of the common RL algorithm but also trains an efficient deep neural network (DNN) controller that generates the control sequence for the chaser spacecraft. Numerical simulation results show that the proposed algorithm is feasible and that the trained DNN controller improves efficiency over traditional optimization methods by roughly two orders of magnitude.
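To make the MDP framing concrete, the following is a minimal sketch of one state transition for a coplanar low-thrust chaser. It is not the paper's model: the polar state (r, theta, vr, vt), the normalized units with MU = 1, the constant-action RK4 step, and the distance-based `reward` are all assumptions chosen for illustration.

```python
import math

MU = 1.0  # gravitational parameter in normalized units (assumption)

def dynamics(state, accel, alpha):
    """Planar two-body dynamics in polar form with a thrust acceleration.

    state = (r, theta, vr, vt); alpha is the thrust angle measured from
    the local transverse direction toward the radial direction.
    """
    r, theta, vr, vt = state
    return (
        vr,
        vt / r,
        vt * vt / r - MU / (r * r) + accel * math.sin(alpha),
        -vr * vt / r + accel * math.cos(alpha),
    )

def step(state, accel, alpha, dt):
    """One MDP transition: hold the action constant and integrate with RK4."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = dynamics(state, accel, alpha)
    k2 = dynamics(shifted(state, k1, dt / 2), accel, alpha)
    k3 = dynamics(shifted(state, k2, dt / 2), accel, alpha)
    k4 = dynamics(shifted(state, k3, dt), accel, alpha)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def reward(state, target):
    """Hypothetical shaping reward: negative chaser-target separation."""
    r, theta = state[0], state[1]
    rt, theta_t = target[0], target[1]
    sq = r * r + rt * rt - 2 * r * rt * math.cos(theta - theta_t)
    return -math.sqrt(max(0.0, sq))

def energy(state):
    """Specific orbital energy, useful as a sanity check on the dynamics."""
    r, _, vr, vt = state
    return 0.5 * (vr * vr + vt * vt) - MU / r
```

In this sketch the action is the pair (accel, alpha). A trained controller would map the state to such an action at every step, which is where a policy network would replace a per-case trajectory optimization.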
Funding: Supported in part by the National Natural Science Foundation of China (No. 11772104), the Key Research and Development Plan of Heilongjiang Province, China (No. GZ20210120), and the Fundamental Research Funds for the Central Universities, China.
Abstract: This paper proposes a fast calculation method that finds all mission opportunities for orbital interception and orbital rendezvous under an impulse-magnitude constraint. Unlike existing search methods, the proposed method does not need to solve Lambert's problem at any stage. Three cases are considered: free departure time, free transfer time, and both free. For a fixed departure time, the feasible windows of transfer time are obtained by solving a single-variable nonlinear equation in the terminal true anomaly alone. Similarly, for a fixed interception (or rendezvous) time, the feasible windows of departure time are obtained. When both departure time and transfer time are free, all mission opportunities are obtained with a one-dimensional search strategy. The hyperbolic-transfer and multiple-revolution cases are also analyzed. Numerical results show that the proposed method requires less computational time than the typical pork-chop plot method and the two-dimensional launch window method.
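The idea of reading feasible windows off a single-variable equation can be illustrated generically. The sketch below is not the paper's equation: `feasible_windows` is a hypothetical helper that brackets sign changes of a scalar function on a grid and refines them by bisection, and `toy_excess` is a made-up stand-in for "required impulse minus impulse limit" as a function of the terminal true anomaly.

```python
import math

def feasible_windows(g, lo, hi, samples=2000, tol=1e-10):
    """Return intervals within [lo, hi] on which g(x) <= 0.

    Boundaries are bracketed by sign changes on a uniform grid and then
    refined by bisection; roots closer together than the grid spacing
    can be missed.
    """
    def bisect(a, b):
        ga = g(a)
        while b - a > tol:
            m = 0.5 * (a + b)
            gm = g(m)
            if (ga <= 0) == (gm <= 0):
                a, ga = m, gm  # boundary lies in the upper half
            else:
                b = m          # boundary lies in the lower half
        return 0.5 * (a + b)

    xs = [lo + (hi - lo) * i / samples for i in range(samples + 1)]
    windows, start = [], None
    prev_x, prev_ok = xs[0], g(xs[0]) <= 0
    if prev_ok:
        start = lo
    for x in xs[1:]:
        ok = g(x) <= 0
        if ok != prev_ok:
            edge = bisect(prev_x, x)
            if ok:
                start = edge                   # window opens
            else:
                windows.append((start, edge))  # window closes
                start = None
        prev_x, prev_ok = x, ok
    if start is not None:
        windows.append((start, hi))
    return windows

# Toy stand-in for "required impulse minus impulse limit" (hypothetical).
def toy_excess(f):
    return math.cos(f) - 0.5

windows = feasible_windows(toy_excess, 0.0, 2 * math.pi)
```

For the toy function, the constraint holds wherever cos(f) <= 0.5, so a single window from pi/3 to 5*pi/3 is expected. With a real constraint function, each window boundary corresponds to a root of the single-variable equation.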