Abstract
This paper addresses the decision-making problem of a spacecraft evading the threat of an approaching space target, and proposes an intelligent decision-making framework together with an autonomous decision-making method based on deep reinforcement learning. Considering the maneuvering characteristics of space targets and the game-like nature of threat avoidance, an intelligent game decision-making framework for spacecraft threat avoidance is built on the Observe-Orient-Decide-Act (OODA) loop and machine learning techniques. Based on this framework and on inference of the target's motion intention, a deep reinforcement learning-based maneuver decision-making algorithm and its training environment are designed to give the spacecraft's decision and control a game-response capability, enabling avoidance responses to typical motion intentions of space targets. Furthermore, self-play training is used to improve the generalization of the autonomous maneuver decision-making algorithm and its adaptability to uncertain target maneuvers. Finally, simulation examples and analysis verify the effectiveness of the proposed method.
Authors
ZHANG Honglin (张鸿林)
LUO Jianjun (罗建军)
MA Weihua (马卫华)
School of Astronautics, Northwestern Polytechnical University, Xi'an 710072, China; Science and Technology on Aerospace Flight Dynamics Laboratory, Xi'an 710072, China
Source
Acta Aeronautica et Astronautica Sinica (《航空学报》)
EI
CAS
CSCD
PKU Core (北大核心)
2024, No. 8, pp. 244-259 (16 pages)
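Funding
National Natural Science Foundation of China (12072269)
Foundation of the Science and Technology on Aerospace Flight Dynamics Laboratory (6142210210302)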
Keywords
spacecraft maneuver
intelligent decision-making
threat avoidance
OODA loop
deep reinforcement learning