Abstract
To address the large-scale dynamic weapon target assignment (WTA) problem in which damage probabilities and target values change over time, this paper proposes a solution method based on deep reinforcement learning, specifically deep Q-learning with a deep Q-network (DQN). The method adopts a double neural network structure and uses a simple, intuitive state and reward model derived from the WTA objective function. The method was validated in several simulated scenarios. The results show that it converges quickly and outperforms a particle swarm-based method in both total damage and computation time. It also handles WTA under dynamically changing damage probabilities and target values effectively, which indicates good extensibility. The method can be applied to fast WTA solving in scenarios such as combat mission planning and automated engagement between units in combat simulation.
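The abstract states that the reward is designed from the WTA objective function but does not give the formulas. The sketch below is therefore only an illustration under stated assumptions: it uses the textbook WTA objective (expected destroyed target value, with independent kill probabilities) and defines a per-step reward as the marginal gain in that objective when one more weapon is assigned. The function names `expected_destroyed_value` and `step_reward`, the data layout, and the reward definition itself are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence, Tuple


def expected_destroyed_value(
    values: Sequence[float],
    kill_prob: Sequence[Sequence[float]],
    assignment: Sequence[Optional[int]],
) -> float:
    """Textbook WTA objective (an assumption, not the paper's exact formula).

    values[j]       value of target j
    kill_prob[i][j] probability that weapon i destroys target j
    assignment[i]   target index assigned to weapon i, or None if unassigned
    """
    # Survival probability of each target starts at 1 and is multiplied
    # by (1 - p_ij) for every weapon i assigned to target j.
    survive = [1.0] * len(values)
    for i, j in enumerate(assignment):
        if j is not None:
            survive[j] *= 1.0 - kill_prob[i][j]
    return sum(v * (1.0 - s) for v, s in zip(values, survive))


def step_reward(
    values: Sequence[float],
    kill_prob: Sequence[Sequence[float]],
    assignment: Sequence[Optional[int]],
    weapon: int,
    target: int,
) -> Tuple[float, List[Optional[int]]]:
    """Hypothetical reward: marginal gain in the objective from one assignment."""
    before = expected_destroyed_value(values, kill_prob, assignment)
    new_assignment = list(assignment)
    new_assignment[weapon] = target
    after = expected_destroyed_value(values, kill_prob, new_assignment)
    return after - before, new_assignment


# Two weapons, two targets; all numbers are made up for illustration.
values = [10.0, 5.0]
kill_prob = [[0.5, 0.2], [0.5, 0.8]]
reward_1, state_1 = step_reward(values, kill_prob, [None, None], weapon=0, target=0)
reward_2, state_2 = step_reward(values, kill_prob, state_1, weapon=1, target=1)
```

In this sketch, dynamic damage probabilities or target values could be modeled by passing updated `kill_prob` or `values` at each step; whether the paper handles them this way is not stated in the abstract.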
Authors
LIN Diao (林雕); ZHU Yan (朱燕); YANG Jian (杨剑)
Affiliations: Army Command College, Nanjing 210045, China; Unit 61175 of PLA, Nanjing 210046, China; Information Engineering University, Zhengzhou 450052, China
Source
Fire Control & Command Control (《火力与指挥控制》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2024, Issue 2, pp. 138-143 (6 pages)
Keywords
weapon target assignment
deep reinforcement learning
dynamic damage rate
dynamic target value