In this work, the classical single cart-pole dynamic system is extended to a double cart-pole dynamic system with the inclusion of a competing target, which enables the study of multi-agent deep learning problems at an affordable cost. The corresponding key issues, such as the system dynamics, the reward function, and the simultaneous training of opponent agents, are discussed in detail. To showcase the system dynamics, a couple of agents are trained, and an analysis of the competition results reveals the key pattern for winning. It appears that a defensive agent is always defeated by an offensive agent, even though the associated neural network has very limited intelligence. When both agents are defensive, the system dynamics remain stable and reach a Nash equilibrium. Overall, the proposed dynamic system could serve as a surrogate model and assist the study of how to escape the so-called Thucydides trap.
Funding: Supported by the National Science Foundation of China (Grant No. 91852201).