Design of Jamming-Detection Shared Signal Based on Deep Reinforcement Learning
Abstract: Radar electronic warfare is becoming increasingly intelligent, and traditional jammers cannot adapt to changes in the environment, which greatly reduces their operational effectiveness. To address this, the detection signal is hidden in the jamming signal to form a jamming-detection shared signal, so that the jamming signal transmitted by reconnaissance jammer equipment also serves a detection function. Because existing jamming-detection shared signals suffer from low complexity and narrow spectral width, a jamming-detection shared signal based on the multi-carrier phase code (MCPC) is designed. It has good noise-like wide-spectrum characteristics and good range and velocity detection capability, so it can apply suppression jamming to the target radar while covertly detecting the target signal and the surrounding environment. To let the shared signal adapt to perception of, and gaming with, the battlefield environment, a deep reinforcement learning algorithm is introduced to optimize the MCPC jamming-detection shared signal. First, the Q-value is regularized on the basis of the dueling deep Q-learning network (Du DQN), which resolves the local-optimum problem caused by overestimation in Du DQN. Second, a state value function is introduced into the reward to form a composite reward; the resulting algorithm is called the composite reward-dueling deep Q-learning network based on regularization (CR-DuDQNReg). With it, the sensitivity of the MCPC shared signal to the reward adjusts with the signal's own state, and the initial phase-code values are optimized adaptively, giving better jamming and covert detection performance. Simulation results show that, for the MCPC shared signal optimized by CR-DuDQNReg, the maximum spectrum amplitude increased by 17.48%, the maximum pulse-compression amplitude increased by 17.25%, and the amplitude of the first sidelobe of the Doppler ambiguity function decreased by 12.69%. CR-DuDQNReg also gave a better optimization result than traditional deep reinforcement learning algorithms.
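To make the MCPC waveform in the abstract concrete, the following is a minimal sketch of how such a pulse could be generated and inspected. It assumes the common MCPC layout (N subcarriers spaced by the inverse chip duration, each carrying an M-chip phase sequence). The function names, the P4-based phase set, and the sampling rate are illustrative assumptions, not the authors' design. The `phases` array stands in for the phase-code values that the abstract says are optimized by reinforcement learning.

```python
import numpy as np

def mcpc_signal(phases, samples_per_chip=16):
    """Complex baseband MCPC pulse (assumed layout, not the paper's exact one).

    phases: (N, M) array of chip phases in radians, N subcarriers x M chips.
    Time is measured in units of the chip duration tb, so the subcarrier
    spacing is 1/tb = 1.
    """
    phases = np.asarray(phases, dtype=float)
    n_carriers, n_chips = phases.shape
    t = np.arange(n_chips * samples_per_chip) / samples_per_chip
    # Hold each chip phase constant for samples_per_chip samples.
    chips = np.repeat(np.exp(1j * phases), samples_per_chip, axis=1)
    # Subcarrier frequencies centred on zero, spaced by 1/tb.
    freqs = np.arange(n_carriers) - (n_carriers - 1) / 2
    carriers = np.exp(2j * np.pi * freqs[:, None] * t[None, :])
    # Normalise so the mean power is 1 regardless of N.
    return (chips * carriers).sum(axis=0) / np.sqrt(n_carriers)

def pulse_compression(signal):
    """Magnitude of the aperiodic autocorrelation (matched-filter output)."""
    return np.abs(np.correlate(signal, signal, mode="full"))

# Hypothetical starting phases: cyclic shifts of a P4 sequence per subcarrier.
N = M = 8
m = np.arange(M)
p4 = np.pi * m**2 / M - np.pi * m
phases = np.stack([np.roll(p4, k) for k in range(N)])

s = mcpc_signal(phases)
acf = pulse_compression(s)                          # range response
spectrum = np.abs(np.fft.fftshift(np.fft.fft(s)))   # spectral shape
```

Quantities such as the peak of `spectrum` or the mainlobe and sidelobe levels of `acf` are the kind of metrics an optimizer could score when adjusting `phases`. How the paper actually builds its reward from them is not stated in the abstract.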
Authors: Xiao Yihan (肖易寒); Liu Yuxi (刘禹汐); Yu Xiangzhen (于祥祯); Zhao Zhongkai (赵忠凯). Affiliations: College of Information and Communication Engineering, Harbin Engineering University, Harbin 150000, China; Key Laboratory of Advanced Marine Communication and Information Technology, Ministry of Industry and Information Technology, Harbin 150000, China; Shanghai Radio Equipment Research Institute, Shanghai 201100, China.
Source: Journal of Tianjin University: Science and Technology (《天津大学学报(自然科学与工程技术版)》), 2023, No. 12, pp. 1326-1336 (11 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
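Funding: Basic Strengthening Program of National Defense Science and Technology (2019-JCJQ-ZD-067-00); Fundamental Research Funds for the Central Universities (3072022CF0802).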
Keywords: jamming-detection shared signal; multi-carrier phase code; deep reinforcement learning; composite reward