To address the guided search task of airborne phased array radar in Beyond Visual Range (BVR) air combat scenarios with a large airspace and widely distributed cluster targets, a hierarchical strategy framework based on deep reinforcement learning is proposed to guide the different stages of the search task. First, an airspace set-covering model and a radar parameter optimization model are established for the guided search of cluster targets. Second, a hierarchical strategy framework comprising upper-level and lower-level strategies is constructed on the basis of these models. Finally, the happo-rgs algorithm is proposed to extract features from Markov continuous observation sequences, enhancing training effectiveness and improving the convergence speed of the algorithm. Simulation results show that the trained agent can rapidly make precise autonomous decisions based on the airspace-target covering situation and target guidance information, which significantly improves radar search performance in the aforementioned scenarios compared with traditional algorithms.
Funding: Supported by the Open Research Subject of the State Key Laboratory of Intelligent Game, China (No. ZBKF-23-04).