Funding: Supported by the National Natural Science Foundation of China (No. 12022214) and the National Key R&D Program of China (No. 2020YFC2201200).
Abstract: This paper proposes an optimal, robust, and efficient guidance scheme for the perturbed minimum-time low-thrust transfer to geostationary orbit. The Earth's oblateness perturbation and shadow are taken into account. It is difficult for a Lyapunov-based or trajectory-tracking guidance method to achieve high guidance optimality, robustness, and onboard computational efficiency at the same time. In this work, a concise relationship between the minimum-time transfer problem with orbital averaging and its optimal solution is identified: the five averaged initial costates that dominate the optimal thrust direction can be approximately determined by only four initial modified equinoctial orbit elements after a coordinate transformation. Based on this relationship, the optimal averaged trajectories that make up the training dataset are randomly generated around a nominal averaged trajectory. Five polynomial regression models are trained on this dataset and serve as the costate estimators. During the transfer, the spacecraft can obtain the approximate optimal thrust direction in real time by combining the costate estimates provided by the estimators with its current state. Moreover, all of these onboard computations are analytical. The simulation results show that the proposed guidance scheme achieves extremely high guidance optimality, robustness, and onboard computational efficiency.
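The costate-estimator idea can be illustrated with a minimal sketch, assuming a degree-2 polynomial and purely synthetic data. The names `poly_features`, `fit_estimators`, and `estimate_costates`, the polynomial degree, and the random stand-in dataset are all hypothetical; the paper's actual features, degree, and training trajectories are not given in the abstract.

```python
import itertools
import numpy as np

def poly_features(x, degree=2):
    """Return all monomials of the columns of x up to the given total degree."""
    x = np.atleast_2d(x)
    cols = [np.ones(x.shape[0])]
    for d in range(1, degree + 1):
        for idx in itertools.combinations_with_replacement(range(x.shape[1]), d):
            cols.append(np.prod(x[:, list(idx)], axis=1))
    return np.column_stack(cols)

def fit_estimators(states, costates, degree=2):
    """Least-squares fit of one polynomial model per costate column."""
    coeffs, *_ = np.linalg.lstsq(poly_features(states, degree), costates, rcond=None)
    return coeffs  # shape: (n_features, n_costates)

def estimate_costates(coeffs, state, degree=2):
    """Evaluate the fitted polynomials at one state (a cheap closed-form step)."""
    return (poly_features(state, degree) @ coeffs)[0]

# Synthetic stand-in data (assumption): 4 state elements -> 5 costates.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(200, 4))
true_w = rng.normal(size=(15, 5))            # 15 monomials for 4 inputs, degree 2
costates = poly_features(states) @ true_w

coeffs = fit_estimators(states, costates)
lam = estimate_costates(coeffs, states[0])
```

Evaluating a fitted polynomial needs only multiplications and additions, which is consistent with the abstract's point that the onboard computations are analytical.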
Funding: This work is supported by the National Natural Science Foundation of China (Grant Nos. 11672146 and 11432001).
Abstract: In this study, a real-time optimal control approach based on an interactive deep reinforcement learning algorithm is proposed for the fuel-optimal Moon landing problem. Given the restrictions on remote communication and the environmental uncertainties, advanced landing control techniques are needed to meet the demanding requirements of real-time performance and autonomy in Moon landing missions. Deep reinforcement learning (DRL) algorithms have recently been developed for real-time optimal control but suffer from slow convergence and difficult reward function design. To address these problems, a DRL algorithm with an actor-indirect method architecture is developed to achieve optimal control of the Moon landing mission. In this DRL algorithm, an indirect method is employed to generate the optimal control actions from which the deep neural networks (DNNs) learn, while the trained DNNs provide good initial guesses for the indirect method to make training data generation more efficient. After sufficient learning of the state-action relationship, the trained DNNs can approximate the optimal actions and steer the spacecraft to the target in real time. Additionally, a nonlinear feedback controller is developed to improve the terminal landing accuracy. Numerical simulations are given to verify the effectiveness of the proposed DRL algorithm and to demonstrate the performance of the developed optimal landing controller.
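The interplay between an indirect method that produces optimal labels and a learner that imitates them can be sketched on a toy problem. This is an illustration under strong assumptions, not the paper's algorithm: the dynamics are a double integrator (x' = v, v' = u) with a minimum-energy cost and fixed final time `T`, the shooting equations are linear and solved directly, and a linear least-squares map stands in for the DNN actor. All names (`solve_costates`, `terminal_state`, `actor`) are hypothetical.

```python
import numpy as np

T = 2.0  # fixed transfer time (assumed for this toy problem)

def solve_costates(x0, v0, T=T):
    """Indirect method for min 0.5*int(u^2) dt: solve the shooting equations
    x(T) = 0, v(T) = 0 for the initial costates (lam_x, lam_v0)."""
    A = np.array([[T**3 / 6.0, -T**2 / 2.0],
                  [T**2 / 2.0, -T]])
    b = -np.array([x0 + v0 * T, v0])
    return np.linalg.solve(A, b)

def terminal_state(x0, v0, lam, T=T):
    """Propagate in closed form with the optimal control u(t) = -lam_v0 + lam_x * t."""
    lam_x, lam_v0 = lam
    x = x0 + v0 * T - lam_v0 * T**2 / 2.0 + lam_x * T**3 / 6.0
    v = v0 - lam_v0 * T + lam_x * T**2 / 2.0
    return np.array([x, v])

# The indirect method generates the training labels for sampled initial states.
rng = np.random.default_rng(1)
states = rng.uniform(-1.0, 1.0, size=(50, 2))
labels = np.array([solve_costates(x, v) for x, v in states])

# Linear least-squares "actor" (a stand-in for the DNN, by assumption).
W, *_ = np.linalg.lstsq(states, labels, rcond=None)

def actor(x0, v0):
    """Predict the initial costates for a new initial state."""
    return np.array([x0, v0]) @ W

final = terminal_state(0.7, -0.3, actor(0.7, -0.3))
```

In a nonlinear problem such as the landing task, the actor's prediction would presumably serve as the initial guess for an iterative shooting solver rather than as an exact solution; that warm-start loop is not shown here.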
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11672146 and 11432001). The authors thank the organizer of GTOC8.
Abstract: This paper presents the key method behind our team's results in the 8th Global Trajectory Optimization Competition (GTOC8). Because the positions and velocities of the spacecraft cannot be completely determined by a single observation of a single radio source, the branch and bound method used for sequence optimization in multi-asteroid exploration cannot be applied directly here. To overcome this difficulty, an optimization method is proposed that searches for the observing sequence based on nominal low-thrust trajectories of the symmetric observing configuration. With the symmetric observing configuration, the normal vector of the triangle plane formed by the three spacecraft rotates periodically in the ecliptic plane and approximately points to the radio sources that are close to the ecliptic plane. All possible observing opportunities are selected and ranked according to the nominal trajectories designed with the symmetric observing configuration. First, the branch and bound method is employed to find the optimal sequence of the radio sources that are observed three times. Second, the same method is used to find the optimal sequence of the remaining radio sources. The nominal trajectories are then corrected for accurate observations. The performance index of our result is 128,286,317.0 km, which ranked second in GTOC8.
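Branch and bound for sequence search can be sketched generically as follows. This is a toy, not the team's GTOC8 implementation: it minimizes a made-up pairwise cost over visiting orders, whereas the competition maximizes a performance index over observing opportunities. The function name `branch_and_bound`, the cost matrix, and the bounding rule are assumptions chosen for illustration.

```python
import numpy as np

def branch_and_bound(cost):
    """Find the cheapest order that visits every node once, starting at node 0."""
    n = len(cost)
    # Cheapest way to arrive at each node: an optimistic lower bound per node.
    min_in = [min(cost[i][j] for i in range(n) if i != j) for j in range(n)]
    best = {"cost": float("inf"), "seq": None}

    def expand(seq, spent, remaining):
        if not remaining:
            if spent < best["cost"]:
                best["cost"], best["seq"] = spent, list(seq)
            return
        # Prune: even the optimistic completion cannot beat the incumbent.
        if spent + sum(min_in[j] for j in remaining) >= best["cost"]:
            return
        # Try cheaper next hops first so good incumbents appear early.
        for j in sorted(remaining, key=lambda k: cost[seq[-1]][k]):
            expand(seq + [j], spent + cost[seq[-1]][j], remaining - {j})

    expand([0], 0.0, frozenset(range(1, n)))
    return best["seq"], best["cost"]

# Random stand-in costs between 6 hypothetical targets.
rng = np.random.default_rng(2)
cost = rng.uniform(1.0, 10.0, size=(6, 6))
seq, total = branch_and_bound(cost)
```

The bound is valid because every unvisited node must still be entered through some edge, so no completion can cost less than the sum of the cheapest incoming edges.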
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11572166 and 11672146).
Abstract: A set of linearized relative motion equations for spacecraft flying on unperturbed elliptical orbits is specialized for particular cases in which the leader orbit is circular or equatorial. Based on these extended equations, we are able to analyze the relative motion patterns of a pair of spacecraft flying in close formation on arbitrary unperturbed orbits with the same semi-major axis. Given the initial orbital elements of the leader, this paper presents a simple way to design the initial relative orbital elements of close spacecraft with the same semi-major axis so that collision is prevented under unperturbed conditions. Considering the mean influence of the J_2 perturbation, namely the secular J_2 perturbation, we derive the mean derivatives of the orbital element differences and then expand them to first order. The first-order expansion of the orbital element differences can then be added to the relative motion equations for further analysis. For a pair of spacecraft that will never collide under unperturbed conditions, we present a simple method to determine whether a collision will occur when the J_2 perturbation is considered. Examples are given to demonstrate the validity of the extended relative motion equations and to illustrate how the presented methods can be used. The simple design method proposed here could be helpful in the preliminary design of the relative orbital elements of spacecraft in a close formation when collision avoidance is necessary.
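The secular J_2 effect on orbital element differences can be illustrated with the standard first-order mean rates of the right ascension of the ascending node, the argument of perigee, and the mean anomaly. The sketch below is an assumption-laden illustration rather than the paper's derivation: it takes exact differences of the rates instead of the first-order expansion described in the abstract, and the constants, function names, and example orbits are chosen here for demonstration.

```python
import math

MU = 398600.4418   # Earth's gravitational parameter, km^3/s^2
RE = 6378.137      # Earth's equatorial radius, km
J2 = 1.08263e-3    # Earth's second zonal harmonic

def j2_secular_rates(a, e, inc):
    """Mean J2 rates of RAAN, argument of perigee and mean anomaly, in rad/s."""
    n = math.sqrt(MU / a**3)                 # mean motion
    p = a * (1.0 - e**2)                     # semi-latus rectum
    k = 1.5 * n * J2 * (RE / p) ** 2
    raan_dot = -k * math.cos(inc)
    argp_dot = 0.5 * k * (5.0 * math.cos(inc) ** 2 - 1.0)
    m_dot = n + 0.5 * k * math.sqrt(1.0 - e**2) * (3.0 * math.cos(inc) ** 2 - 1.0)
    return raan_dot, argp_dot, m_dot

def differential_drift(leader, follower):
    """Secular rate differences (follower minus leader) for (a, e, inc) tuples."""
    lead = j2_secular_rates(*leader)
    foll = j2_secular_rates(*follower)
    return tuple(f - l for f, l in zip(foll, lead))

# Same semi-major axis and eccentricity, 0.1 deg inclination offset (example values).
leader = (7000.0, 0.01, math.radians(50.0))
follower = (7000.0, 0.01, math.radians(50.1))
drift = differential_drift(leader, follower)
```

Even with equal semi-major axes, a small inclination offset produces a nonzero differential drift, which is why a formation that is collision-free without perturbations may need to be rechecked once J_2 is included.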
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11902174, 11672146, and 11872223).
Abstract: An intelligent solution method is proposed to achieve real-time optimal control for continuous-time nonlinear systems using a novel identifier-actor-optimizer (IAO) policy learning architecture. In this IAO-based policy learning approach, a dynamical identifier is developed to approximate the unknown part of the system dynamics using deep neural networks (DNNs). Then, an indirect-method-based optimizer is proposed to generate high-quality optimal actions for system control, considering both the constraints and the performance index. Furthermore, a DNN-based actor is developed to approximate the obtained optimal actions and return good initial guesses to the optimizer. In this way, traditional optimal control methods and state-of-the-art DNN techniques are combined in the IAO-based optimal policy learning method. Compared with reinforcement learning algorithms with actor-critic architectures, which suffer from difficult reward design and low computational efficiency, the IAO-based optimal policy learning algorithm has fewer user-defined parameters, higher learning speeds, and steadier convergence properties when solving complex continuous-time optimal control problems (OCPs). Simulation results for three space flight control missions are given to substantiate the effectiveness of this IAO-based policy learning strategy and to illustrate the performance of the developed DNN-based optimal control method for continuous-time OCPs.