As one of the major contributions of biology to competitive decision making, evolutionary game theory provides a useful tool for studying the evolution of cooperation. To achieve the optimal solution for unmanned aerial vehicles (UAVs) that are carrying out a sensing task, this paper presents a Markov decision evolutionary game (MDEG) based learning algorithm. Each individual in the algorithm follows a Markov decision strategy to maximize its payoff against the well-known Tit-for-Tat strategy. Simulation results demonstrate that the MDEG-based approach effectively improves the collective payoff of the roam. The proposed algorithm obtains not only the best action sequence but also a sub-optimal Markov policy that is independent of the game duration. Furthermore, the paper studies the emergence of cooperation in the evolution of self-regarding UAVs. The results show that the emergence of cooperation should be attributed to both the adaptive ability of the MDEG-based approach and the perfect balance between revenge and forgiveness in the Tit-for-Tat strategy.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61425008, 61333004 and 61273054), the Top-Notch Young Talents Program of China, and the Aeronautical Foundation of China (Grant No. 20135851042).
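To make the idea of a Markov decision strategy playing against Tit-for-Tat concrete, the following is a minimal sketch and not the paper's MDEG algorithm. It assumes a two-action iterated prisoner's dilemma with the textbook payoffs (3, 0, 5, 1), a state equal to the move Tit-for-Tat is about to play, and a discount factor `gamma`; none of these values come from the abstract. The function names `best_sequence` and `stationary_policy` are hypothetical. The first computes a best action sequence for a fixed number of rounds by backward induction, the second a stationary Markov policy that does not depend on the game duration.

```python
# Illustrative sketch only: payoffs, discount factor and state encoding
# are assumptions, not values taken from the paper.
C, D = 0, 1  # cooperate, defect

# PAYOFF[my_action][opponent_action], textbook prisoner's dilemma values (assumed)
PAYOFF = [[3, 0],
          [5, 1]]


def best_sequence(horizon, opp_first=C):
    """Backward induction against Tit-for-Tat over a fixed number of rounds.

    The state is the move Tit-for-Tat is about to play, which is a copy of
    our previous action (or opp_first in the first round).
    """
    value = [0.0, 0.0]  # value-to-go after the final round, per state
    plan = []           # plan[t][state] = best action in round t
    for _ in range(horizon):
        new_value, step = [0.0, 0.0], [C, C]
        for s in (C, D):
            # Our action a becomes the next state because Tit-for-Tat copies it.
            q = [PAYOFF[a][s] + value[a] for a in (C, D)]
            step[s] = C if q[C] >= q[D] else D
            new_value[s] = max(q)
        value = new_value
        plan.insert(0, step)
    # Roll the plan forward from the opening state to read off the sequence.
    seq, s = [], opp_first
    for step in plan:
        a = step[s]
        seq.append(a)
        s = a
    return seq, value[opp_first]


def stationary_policy(gamma=0.9, sweeps=500):
    """Discounted value iteration; returns the best action for each state."""
    value = [0.0, 0.0]
    for _ in range(sweeps):
        value = [max(PAYOFF[a][s] + gamma * value[a] for a in (C, D))
                 for s in (C, D)]
    return [max((C, D), key=lambda a: PAYOFF[a][s] + gamma * value[a])
            for s in (C, D)]


if __name__ == "__main__":
    print(best_sequence(5))        # action sequence and total payoff
    print(stationary_policy(0.9))  # patient player
    print(stationary_policy(0.1))  # impatient player
```

Under these assumed payoffs, the finite-horizon plan cooperates until the last round and then defects, while the stationary policy cooperates in both states when `gamma` is high and defects in both when `gamma` is low. The contrast illustrates why a duration-independent Markov policy can differ from the best action sequence for a known game length.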