
一种融合噪声网络的深度强化学习通信干扰资源分配算法 (cited by: 1)

A Deep Reinforcement Learning Communication Jamming Resource Allocation Algorithm Fused with Noise Network
Abstract Traditional jamming resource allocation algorithms need relatively complete prior information to handle nonlinear combinatorial optimization problems, and their decision dimension is small, so they cannot meet the requirements of modern communication countermeasures. To address this, the paper proposes a Deep Reinforcement Learning communication jamming resource allocation algorithm Fused with Noise Network (FNNDRL). Drawing on the idea of noise networks, the algorithm uses twin noisy evaluation networks, which avoid overestimating the Q value and raise the randomness of the evaluation network so that training stays exploratory. Based on the physical meaning of probability entropy, the strategy (policy) network loss function is modified to include the entropy of the strategy distribution, so that training maximizes this entropy together with the cumulative reward and avoids converging to a local optimum during strategy optimization. Simulation results show that the algorithm outperforms the compared average-allocation and reinforcement learning methods on the jamming resource allocation problem, is highly stable, and adapts well to high-dimensional decision spaces.
Authors PENG Xiang (彭翔); XU Hua (许华); JIANG Lei (蒋磊); RAO Ning (饶宁); SONG Bailin (宋佰霖) (Information and Navigation College, Air Force Engineering University, Xi’an 710077, China)
Source Journal of Electronics & Information Technology (《电子与信息学报》), indexed in EI, CSCD, and the Peking University Core list (北大核心); 2023, No. 3, pp. 1043-1054 (12 pages)
Keywords Jamming resource allocation; Deep Reinforcement Learning (DRL); Noise network; Entropy of strategy distribution
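The two mechanisms named in the abstract, twin noisy evaluation networks and an entropy-augmented strategy loss, can be illustrated with a small sketch. This record does not give the paper's equations, so everything below is an assumption made for illustration: the names `NoisyLinear`, `twin_q`, and `policy_loss`, the fixed noise scale `sigma0`, the temperature `alpha`, and the use of a single linear layer per critic are hypothetical and are not taken from FNNDRL. The noise construction follows the general factorised-noise idea from the noisy-network literature, and the loss has the usual entropy-regularised form for a discrete action distribution.

```python
import math
import random


def scaled_noise(n, rng):
    """Draw n Gaussian samples and apply f(x) = sign(x) * sqrt(|x|)."""
    samples = (rng.gauss(0.0, 1.0) for _ in range(n))
    return [math.copysign(math.sqrt(abs(x)), x) for x in samples]


class NoisyLinear:
    """Linear layer with weights mu + sigma * eps; eps is resampled per call.

    Assumption: sigma is one fixed scalar here. In a trainable noisy layer it
    would be a learned parameter per weight.
    """

    def __init__(self, n_in, n_out, rng, sigma0=0.5):
        bound = 1.0 / math.sqrt(n_in)
        self.rng, self.n_in, self.n_out = rng, n_in, n_out
        self.w_mu = [[rng.uniform(-bound, bound) for _ in range(n_in)]
                     for _ in range(n_out)]
        self.b_mu = [rng.uniform(-bound, bound) for _ in range(n_out)]
        self.sigma = sigma0 / math.sqrt(n_in)

    def forward(self, x, noisy=True):
        # With noisy=False the layer reduces to an ordinary linear map.
        eps_in = scaled_noise(self.n_in, self.rng) if noisy else [0.0] * self.n_in
        eps_out = scaled_noise(self.n_out, self.rng) if noisy else [0.0] * self.n_out
        out = []
        for j in range(self.n_out):
            acc = self.b_mu[j] + self.sigma * eps_out[j]
            for i in range(self.n_in):
                # Factorised noise: one sample per input times one per output.
                w = self.w_mu[j][i] + self.sigma * eps_out[j] * eps_in[i]
                acc += w * x[i]
            out.append(acc)
        return out


def twin_q(critic_a, critic_b, state_action, noisy=True):
    """Take the smaller of two critics' estimates to curb Q overestimation."""
    q_a = critic_a.forward(state_action, noisy)[0]
    q_b = critic_b.forward(state_action, noisy)[0]
    return min(q_a, q_b)


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def policy_loss(probs, q_values, alpha=0.2):
    """Negative of (expected Q + alpha * entropy); minimising it raises both."""
    expected_q = sum(p * q for p, q in zip(probs, q_values))
    return -(expected_q + alpha * entropy(probs))


if __name__ == "__main__":
    rng = random.Random(0)
    critic_a, critic_b = NoisyLinear(3, 1, rng), NoisyLinear(3, 1, rng)
    features = [0.2, -0.4, 1.0]  # hypothetical state-action features
    print("noisy twin Q estimates:",
          twin_q(critic_a, critic_b, features),
          twin_q(critic_a, critic_b, features))
    print("loss, uniform strategy:", policy_loss([0.5, 0.5], [1.0, 1.0]))
    print("loss, peaked strategy: ", policy_loss([0.99, 0.01], [1.0, 1.0]))
```

When the Q values of two actions are equal, the uniform strategy gets the lower loss, which is the sense in which the entropy term discourages premature collapse onto a single action. Repeated calls to `twin_q` with noise enabled return different estimates, which is the source of exploration in this sketch.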
