Abstract
The target-selection policy problem in dynamic weapon target assignment (WTA) can be modeled as a Markov decision process (MDP); however, no effective algorithm has so far been available for finding optimal policies of MDPs at this scale. By analyzing the characteristics of the MDP model of dynamic WTA, an improved algorithm for finding the optimal policy is proposed. The improvements concern the rule for selecting the initial policy, the policy improvement rule, and the criterion for judging whether a policy is optimal. As a result, the algorithm requires less computation and less memory while still yielding the optimal solution. Finally, the improved algorithm is compared with the conventional algorithm on a numerical example. The improved algorithm is suitable for policy optimization in relatively large-scale dynamic WTA problems.
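The three components named in the abstract (an initial policy, a policy improvement rule, and an optimality test) are also the building blocks of conventional policy iteration. The sketch below shows that conventional scheme on a toy problem; it is not the paper's improved algorithm, whose details the abstract does not give. The two-target model, the target values, the kill probabilities, and the discount factor are all assumptions made up for illustration.

```python
from itertools import product

# Hypothetical toy model (not from the paper): two targets, one shot per stage.
GAMMA = 0.9                 # assumed discount factor
VALUES = (10.0, 6.0)        # assumed reward for destroying target 0 / target 1
KILL_PROB = (0.4, 0.8)      # assumed single-shot kill probabilities
STATES = list(product((0, 1), repeat=2))  # (alive0, alive1) flags
ACTIONS = (0, 1)            # index of the target to fire at


def transitions(state, action):
    """Return [(probability, next_state, reward)] for firing at `action`."""
    if not state[action]:
        # Firing at an already destroyed target changes nothing.
        return [(1.0, state, 0.0)]
    killed = tuple(0 if i == action else alive for i, alive in enumerate(state))
    p = KILL_PROB[action]
    return [(p, killed, VALUES[action]), (1.0 - p, state, 0.0)]


def q_value(state, action, v):
    """Expected discounted return of taking `action` in `state`, then following v."""
    return sum(p * (r + GAMMA * v[s2]) for p, s2, r in transitions(state, action))


def evaluate(policy, tol=1e-10):
    """Iterative policy evaluation: sweep until values stop changing."""
    v = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            new = q_value(s, policy[s], v)
            delta = max(delta, abs(new - v[s]))
            v[s] = new
        if delta < tol:
            return v


def policy_iteration():
    # Initial policy: always fire at target 0 (an arbitrary starting choice).
    policy = {s: 0 for s in STATES}
    while True:
        v = evaluate(policy)
        stable = True
        for s in STATES:
            best = max(ACTIONS, key=lambda a: q_value(s, a, v))
            # Improvement rule: switch only on a strict gain, to avoid cycling.
            if q_value(s, best, v) > q_value(s, policy[s], v) + 1e-9:
                policy[s] = best
                stable = False
        # Optimality test: no state can be improved by a one-step change.
        if stable:
            return policy, v


if __name__ == "__main__":
    policy, v = policy_iteration()
    for s in STATES:
        print(s, "fire at target", policy[s], "value", round(v[s], 4))
```

Each pass of this scheme evaluates every state and every action, so its cost grows with the size of the state space. That growth is the kind of burden the abstract says the improved algorithm is designed to reduce.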
Source
Systems Engineering-Theory & Practice (《系统工程理论与实践》)
EI
CSCD
Peking University Core Journals (北大核心)
2007, Issue 7, pp. 160-165 (6 pages)
Funding
National Defense Pre-Research Foundation (404010101)
Keywords
operations research
dynamic weapon target assignment
algorithm
policy optimization
Markov decision process (MDP)
mathematical model