Abstract
To address intelligent decision-making in battlefield communication countermeasures, this paper proposes, based on the idea of overall confrontation, a jamming resource allocation decision algorithm built on hierarchical reinforcement learning with bootstrapped expert trajectories (Bootstrapped expert trajectory memory replay-Hierarchical reinforcement learning-Jamming resources distribution decision-Making algorithm, BHJM). To tackle the frequency-hopping jamming decision problem, the algorithm first divides the jamming bands according to the distribution of frequency points, then uses a hierarchical reinforcement learning model to decide the jamming band and the jamming bandwidth at separate levels, and finally samples from an experience replay mechanism based on bootstrapped expert trajectories to train and optimize the algorithm. As a result, under the available jamming resources, and especially when those resources are insufficient, the algorithm jams the most threatening targets first, achieving the optimal jamming effect while reducing the total jamming bandwidth. Simulation results show that, compared with existing resource allocation decision algorithms, the proposed algorithm saves 25% of jamming station resources and reduces jamming bandwidth by 15%, which gives it considerable practical value.
Authors
许华
宋佰霖
蒋磊
饶宁
史蕴豪
XU Hua; SONG Bailin; JIANG Lei; RAO Ning; SHI Yunhao (Information and Navigation College, Air Force Engineering University, Xi’an 710077, China)
Source
《电子与信息学报》
EI
CSCD
PKU Core (Peking University core journal list)
2021, Issue 11, pp. 3086-3095 (10 pages)
Journal of Electronics & Information Technology
Keywords
Intelligent jamming decision
Hierarchical Reinforcement Learning (HRL)
Jamming resource allocation
Expert trajectory