An Autonomous Guidance Method of UAV in Close Air Combat Based on PPO Algorithm
Abstract: To address autonomous decision-making for UAVs in close air combat, an autonomous guidance method based on the Proximal Policy Optimization (PPO) algorithm is proposed. The reward is reshaped using the relative distance, angle, and speed between the UAV and the enemy aircraft, together with mission constraints. A three-degree-of-freedom model of the UAV is established, and the reinforcement learning states and actions are constructed in the velocity coordinate system. Two models are trained in simulation: PPO combined with a fully connected neural network (the standard PPO algorithm) and PPO combined with a long short-term memory network (the improved PPO algorithm). The training results indicate that, compared with the standard PPO algorithm, the proposed improved PPO algorithm handles UAV autonomous guidance tasks, which are highly correlated with time series, more effectively.
Authors: QIU Yan; ZHAO Baoqi; ZOU Jie; LIU Zhongkai (Science and Technology on Electro-Optical Control Laboratory, Luoyang 471000, China; Luoyang Institute of Electro-Optical Equipment, AVIC, Luoyang 471000, China; The Second Representative Office of Air Force Armament Department in Luoyang, Luoyang 471000, China)
Source: Electronics Optics & Control (CSCD; Peking University Core Journals), 2023, No. 1, pp. 8-14 (7 pages)
Funding: Aeronautical Science Foundation of China (2020Z015013001).
Keywords: close air combat; Proximal Policy Optimization (PPO); autonomous guidance; long short-term memory network
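
For readers unfamiliar with PPO, the following is a minimal sketch of the clipped surrogate objective that PPO generally optimizes. It is not taken from the paper: the function name `ppo_clip_loss`, its parameters, and the clipping value 0.2 are illustrative assumptions, and the paper's network architectures (fully connected or LSTM), reward terms, and hyperparameters are not reproduced here.

```python
import math


def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Negative PPO clipped surrogate objective, averaged over samples.

    Hypothetical helper for illustration only.
    logp_new / logp_old: log-probabilities of the taken actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for the same samples.
    clip_eps: clipping range (0.2 is a common default, assumed here).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Pessimistic (lower) bound of the two surrogate terms.
        total += min(ratio * adv, clipped_ratio * adv)
    # Return the negative mean so the value can be minimized.
    return -total / len(advantages)


if __name__ == "__main__":
    # The ratio is 2 but the advantage is positive, so the gain is capped
    # at (1 + clip_eps) * advantage.
    print(ppo_clip_loss([math.log(2.0)], [0.0], [1.0]))
```

The clipping removes the incentive to move the policy far from the one that generated the data in a single update. In a recurrent (LSTM) variant such as the one the abstract describes, the same objective would typically be applied, with the log-probabilities computed from sequences of observations rather than single states.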