Abstract
Traditional jamming resource allocation algorithms need fairly complete prior information to handle nonlinear combinatorial optimization problems, and their small decision dimension cannot meet the requirements of modern communication countermeasures. To address this, this paper proposes a Deep Reinforcement Learning communication jamming resource allocation algorithm Fused with Noise Network (FNNDRL). Drawing on the idea of the noise network, the algorithm uses twin noise evaluation networks, which avoid Q-value overestimation while increasing the randomness of the evaluation networks so that training stays exploratory. Based on the physical meaning of probability entropy, it also uses a strategy (policy) network loss function improved with the entropy of the strategy distribution: the cumulative reward and the strategy distribution entropy are maximized together, which avoids convergence to a local optimum during strategy optimization. Simulation results show that the algorithm outperforms the compared average-allocation and reinforcement learning methods on the jamming resource allocation problem, is highly stable, and adapts well to high-dimensional decision spaces.
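The two ideas named in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the abstract gives no formulas, so the sketch assumes a discrete action distribution, Gaussian weight noise of the form mu + sigma * eps, and a loss of the form -(E[min(Q1, Q2)] + alpha * H). All function names, the coefficient `alpha`, and the numbers are hypothetical.

```python
import math
import random


def noisy_linear(x, mu_w, sigma_w, mu_b, sigma_b, rng):
    """One linear unit whose weights and bias are perturbed by Gaussian noise.

    Assumed form: effective parameter = mu + sigma * eps, eps ~ N(0, 1),
    resampled on every call, so repeated evaluations of one input differ.
    """
    out = mu_b + sigma_b * rng.gauss(0.0, 1.0)
    for xi, m, s in zip(x, mu_w, sigma_w):
        out += (m + s * rng.gauss(0.0, 1.0)) * xi
    return out


def policy_entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_regularized_policy_loss(probs, q1, q2, alpha):
    """Assumed loss: -(E_pi[min(Q1, Q2)] + alpha * H(pi)).

    Taking the per-action minimum of the two evaluation networks' estimates
    is a common guard against Q-value overestimation.
    """
    q_min = [min(a, b) for a, b in zip(q1, q2)]
    expected_q = sum(p * q for p, q in zip(probs, q_min))
    return -(expected_q + alpha * policy_entropy(probs))


# Hypothetical usage: two evaluation networks disagree on two actions.
rng = random.Random(0)
q_sample = noisy_linear([1.0, 2.0], [0.5, -1.0], [0.1, 0.1], 0.25, 0.1, rng)
loss = entropy_regularized_policy_loss([0.5, 0.5], [1.0, 2.0], [2.0, 1.0], alpha=0.1)
print(q_sample, loss)
```

Minimizing this loss raises both the expected Q-value and the entropy of the action distribution, so a near-deterministic policy is penalized relative to a more spread-out one with the same expected value. How the paper weights the entropy term is not stated in the abstract.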
Authors
彭翔
许华
蒋磊
饶宁
宋佰霖
PENG Xiang; XU Hua; JIANG Lei; RAO Ning; SONG Bailin (Information and Navigation College, Air Force Engineering University, Xi’an 710077, China)
Source
《电子与信息学报》
EI
CSCD
Peking University Core Journals
2023, Issue 3, pp. 1043-1054 (12 pages)
Journal of Electronics & Information Technology
Keywords
Jamming resource allocation
Deep Reinforcement Learning (DRL)
Noise network
Entropy of strategy distribution