Jamming Decision Method Based on Maximum Expected Value Weighted Estimation in Time-Varying Environments (cited by: 2)

A Novel Jamming Bandits Based on Maximum Expected Value Weighting Method in Time-varying Environment
Abstract: Cognitive radar countermeasure technology gives a jamming system the ability to learn autonomously, so that it can make intelligent jamming decisions without prior knowledge. Existing jamming decision methods based on reinforcement learning struggle to achieve a high expected reward in radar countermeasure environments that demand real-time response, allow only limited engagement time, and face rapidly changing radar strategies. Based on multi-armed bandit (MAB) theory, this paper proposes an online jamming decision method for time-varying environments that combines maximum expected value weighted (MEVW) estimation with a learning-window shifting (LWS) approach: MEVW improves the accuracy with which the arm of maximum reward is identified, and LWS lets the jamming decision adapt to a time-varying radar environment. Numerical simulations in typical time-varying settings show that the proposed method achieves higher decision reward and better adaptability to environmental change than traditional methods.
Authors: WANG Jun (王军); YE Licheng (叶立诚); LIU Shuai (刘帅); HAN Dongmei (韩冬梅) (School of Information Science and Engineering, Harbin Institute of Technology at Weihai, Weihai 264209, China; Shandong New Beiyang Information Technology Co., Ltd., Weihai 264203, China)
Source: Modern Radar (《现代雷达》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 3, pp. 30-36 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (62071144).
Keywords: cognitive radar countermeasure; time-varying environment; jamming strategy; multi-armed bandit; maximum expected value weighting
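
The abstract gives no formulas for MEVW or LWS, so the sketch below is not the authors' method. It only illustrates the general idea behind a shifting learning window in a multi-armed bandit: reward estimates are computed from the most recent observations, so the choice of arm can follow a change in the environment. The technique swapped in here is a plain sliding-window UCB rule; the class name `SlidingWindowUCB`, the parameters `window` and `c`, and the three-arm Bernoulli environment in `simulate` are all assumptions made for illustration.

```python
import math
import random
from collections import deque


class SlidingWindowUCB:
    """UCB-style bandit that only remembers the last `window` pulls.

    Illustrative stand-in, not the MEVW/LWS method of the paper.
    """

    def __init__(self, n_arms, window, c=1.0):
        self.n_arms = n_arms
        self.c = c  # exploration weight (assumed value)
        self.history = deque(maxlen=window)  # (arm, reward) pairs

    def select_arm(self):
        counts = [0] * self.n_arms
        sums = [0.0] * self.n_arms
        for arm, reward in self.history:
            counts[arm] += 1
            sums[arm] += reward
        # Re-try any arm that has no observation inside the window.
        for arm in range(self.n_arms):
            if counts[arm] == 0:
                return arm
        t = len(self.history)
        scores = [
            sums[a] / counts[a] + self.c * math.sqrt(math.log(t) / counts[a])
            for a in range(self.n_arms)
        ]
        return max(range(self.n_arms), key=scores.__getitem__)

    def update(self, arm, reward):
        # Appending to a full deque silently drops the oldest pair.
        self.history.append((arm, reward))


def simulate(steps=4000, switch_at=2000, window=200, seed=0):
    """Return the share of the last 500 pulls spent on the new best arm."""
    rng = random.Random(seed)
    before = [0.9, 0.2, 0.2]  # hypothetical success rates, phase 1
    after = [0.2, 0.2, 0.9]   # best arm moves from 0 to 2 in phase 2
    agent = SlidingWindowUCB(3, window)
    late_best = 0
    for t in range(steps):
        probs = before if t < switch_at else after
        arm = agent.select_arm()
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        agent.update(arm, reward)
        if t >= steps - 500 and arm == 2:
            late_best += 1
    return late_best / 500
```

Calling `simulate()` runs one change of the best arm halfway through and reports how often the agent plays the new best arm near the end. A shorter `window` reacts faster to a change but gives noisier estimates; a longer one does the opposite.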
