As an efficient method for finding subgame-perfect Nash equilibria, backward induction is analyzed in this paper from an evolutionary point of view: each player is replaced with a population, turning the game into a population game. The analysis shows that the equilibrium of a perfect-information game is the unique evolutionarily stable outcome for dynamic models in the limit.
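Backward induction itself can be illustrated with a minimal sketch. The `Node` class, the function name, and the entry-game payoffs below are hypothetical choices made for illustration; they are not taken from the paper, and the sketch covers only the classical procedure, not the evolutionary analysis.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """A node of a finite perfect-information game tree (illustrative type)."""
    player: Optional[int] = None                   # who moves here; None at a leaf
    payoffs: Optional[Tuple[float, ...]] = None    # payoff vector at a leaf
    children: Dict[str, "Node"] = field(default_factory=dict)


def backward_induction(node: Node) -> Tuple[Tuple[float, ...], List[str]]:
    """Return the payoff vector and the action path chosen by backward induction."""
    if not node.children:                          # leaf: nothing left to decide
        return node.payoffs, []
    best = None
    for action, child in node.children.items():
        value, path = backward_induction(child)    # solve the subgame first
        # The mover keeps the action with the highest own payoff
        # (ties are broken in favor of the action listed first).
        if best is None or value[node.player] > best[0][node.player]:
            best = (value, [action] + path)
    return best


# Hypothetical entry game: player 0 (entrant) moves first, player 1 (incumbent) second.
game = Node(player=0, children={
    "out": Node(payoffs=(0, 2)),
    "in": Node(player=1, children={
        "fight": Node(payoffs=(-1, -1)),
        "accommodate": Node(payoffs=(1, 1)),
    }),
})

value, path = backward_induction(game)
print(value, path)
```

In this toy game the incumbent prefers to accommodate once entry has occurred, so the entrant enters; the threat to fight is not credible, which is exactly what subgame perfection rules out.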
In evolutionary games, most studies on finite populations have focused on a single updating mechanism. However, given differences in individual cognition, individuals may change their strategies according to different updating mechanisms. For this reason, we consider two aspiration-driven updating mechanisms in structured populations: satisfied-stay unsatisfied-shift (SSUS) and satisfied-cooperate unsatisfied-defect (SCUD). To simulate the players' learning process, this paper proposes an improved particle swarm optimization algorithm, population particle swarm optimization (PPSO), which is used to simulate the players' strategy selection. We find that in the prisoner's dilemma, the conditions under which SSUS facilitates the evolution of cooperation do not enable cooperation to emerge. In contrast, the conditions under which SCUD promotes the evolution of cooperation do enable cooperation to emerge. In addition, the invasion of SCUD individuals helps promote cooperation among SSUS individuals. The theoretical approximation results are found to be consistent with the trend of the simulation results obtained with the PPSO algorithm.
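Read literally, the two rule names suggest the following deterministic sketch. This is an assumption based only on the names "satisfied-stay unsatisfied-shift" and "satisfied-cooperate unsatisfied-defect": the function names, the `"C"`/`"D"` encoding, and the choice to count a payoff equal to the aspiration as "satisfied" are all hypothetical, and the paper's actual update rules may be stochastic or differ in detail.

```python
def ssus(strategy: str, payoff: float, aspiration: float) -> str:
    """Satisfied-stay, unsatisfied-shift: keep the strategy if the payoff
    meets the aspiration level, otherwise switch to the other strategy."""
    if payoff >= aspiration:          # assumption: equality counts as satisfied
        return strategy
    return "D" if strategy == "C" else "C"


def scud(payoff: float, aspiration: float) -> str:
    """Satisfied-cooperate, unsatisfied-defect: the next strategy depends
    only on satisfaction, not on the current strategy."""
    return "C" if payoff >= aspiration else "D"


# A cooperator who was exploited (payoff 0) with aspiration 2:
print(ssus("C", 0, 2), scud(0, 2))
```

The difference is that SSUS conditions the next move on the current strategy, while SCUD ties cooperation directly to satisfaction.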
Solving the optimization problem of approaching a Nash equilibrium point plays an important role in imperfect-information games such as StarCraft and poker. Neural Fictitious Self-Play (NFSP) is an effective algorithm that learns an approximate Nash equilibrium of imperfect-information games purely from self-play, without prior domain knowledge. However, it needs to train a neural network in an off-policy manner to approximate the action values. For games with large search spaces, the training may suffer from unnecessary exploration and sometimes fails to converge. In this paper, we propose a new Neural Fictitious Self-Play algorithm that combines Monte Carlo tree search with NFSP, called MC-NFSP, to improve performance in real-time zero-sum imperfect-information games. Through experiments and empirical analysis, we demonstrate that the proposed MC-NFSP algorithm can approximate a Nash equilibrium in games with large search depth, whereas NFSP cannot. Furthermore, we develop an Asynchronous Neural Fictitious Self-Play framework (ANFSP), which uses an asynchronous and parallel architecture to collect game experience and improves both training efficiency and policy quality. Experiments with games with hidden state information (Texas Hold'em) and first-person shooter (FPS) games demonstrate the effectiveness of our algorithms.
We study large-population stochastic dynamic games in which the so-called Nash certainty equivalence based control laws are implemented by the individual players. We first show a martingale property for the limiting control problem of a single agent and then average across the population; this procedure leads to a constant value for the martingale, which shows an invariance property of the population behavior induced by the Nash strategies.
The problem of incentive equilibria for dynamic games proposed in Ref. 1 is considered. A sufficient condition and a necessary condition for the existence of incentive equilibria are proposed. As an illustrative example, the incentive equilibria for linear-quadratic games are discussed.
The transformation of characteristic functions is an effective way to avoid time-inconsistency of cooperative solutions in dynamic games. There are several forms of such transformations. In this paper, a class of general transformations of characteristic functions is proposed. It leads to the time-consistency of cooperative solutions and guarantees that the irrational-behavior-proof conditions hold. To illustrate the theory, an example of a dynamic game on a tree is given.
Cooperative autonomous air combat of multiple unmanned aerial vehicles (UAVs) is one of the main combat modes in future air warfare, and it becomes even more complicated in a highly changeable situation with uncertain information about the opponents. This paper therefore presents a cooperative decision-making method based on an incomplete-information dynamic game to generate maneuver strategies for multiple UAVs in air combat. First, a cooperative situation assessment model is presented to measure the overall combat situation. Second, an incomplete-information dynamic game model is proposed to model the dynamic process of air combat, and a dynamic Bayesian network is designed to infer the tactical intention of the opponent. Then, a reinforcement learning framework based on the multi-agent deep deterministic policy gradient is established to obtain the perfect Bayes-Nash equilibrium solution of the air combat game model. Finally, a series of simulations is conducted to verify the effectiveness of the proposed method, and the simulation results show effective synergies and cooperative tactics.
A certain constrained dynamic game is shown to be equivalent to a pair of symmetric dual variational problems that have a more general formulation than those already existing in the literature. Various duality results are proved under convexity and generalized convexity assumptions on the appropriate functionals. The dynamic game is also viewed as equivalent to a pair of dual variational problems without the condition of fixed points. It is also indicated that the equivalent pair of symmetric dual variational problems is a dynamic generalization of those already studied in the literature. In essence, the purpose of the research is to establish that the solution of the variational problems yields the solution of the dynamic game.
A coordinated power source and power grid typhoon defense strategy, based on multi-agent dynamic game theory, is proposed in this study to minimize the cost of power grid anti-typhoon reinforcement measures and improve defense efficiency. The strategy regards a typhoon as a rational player that always causes the greatest damage. Together with the grid planner and the black start unit (BSU) planner, it naturally forms a multi-agent defense-attack-defense dynamic game model. The model is used to determine the optimal reinforcements for the transmission lines and the black start power capacity and location. Typhoon Hato, which struck part of the coastal area of Guangdong Province, China, in 2017, was used to formulate a step-by-step model of a typhoon attacking coastal power systems. The results were substituted into the multi-agent defense-attack-defense dynamic game model to obtain the optimal transmission line reinforcement positions, as well as the optimal BSU capacity and geographic positions. An effective typhoon defense strategy and minimum load shedding were achieved, demonstrating the feasibility and correctness of the proposed strategy. The theories and methods of this study are of positive significance for the prevention of uncertain large-scale natural disasters.
The goal of delivering high-quality service has spurred research on 6G satellite communication networks. The problem of allocating limited resources arises in next-generation satellite communication networks, especially multilayer networks with multiple low-Earth-orbit (LEO) and non-low-Earth-orbit (NLEO) satellites. In this study, the resource-allocation problem of a multilayer satellite network consisting of one NLEO satellite and multiple LEO satellites is solved. The NLEO satellite is the authorized user of spectrum resources, and the LEO satellites are unauthorized users. The resource allocation and dynamic pricing problems are combined, and a dynamic game-based resource pricing and allocation model is proposed to maximize the market advantage of the LEO satellites and reduce interference between LEO and NLEO satellites. In the proposed model, the resource price is formulated as the dynamic state of the LEO satellites, with the resource allocation strategy as the control variable. Based on the proposed dynamic game model, an open-loop Nash equilibrium is analyzed, and an algorithm is proposed for the resource pricing and allocation problem. Numerical simulations validate the model and algorithm.
Addressing the adversarial, multi-objective, and hierarchical characteristics of the air-formation-to-ground attack-defense campaign, and using a dynamic state-space model of the military campaign, this article establishes a leader-follower hierarchical interactive decision-making method, the Nash-Stackelberg-Nash model, to solve problems in military operations and find the associated best strategy in hierarchical dynamic decision-making. The simulation results indicate that, when applied to an air-formation-to-ground attack-defense decision-making system, the model handles two-level dynamic adversarial decision-making well and achieves a favorable effect in battle. This shows that the model can provide an effective way of analyzing a battle.
Using economics and game theory, two kinds of models are proposed in this paper under the assumption that foreign and domestic firms act in a dynamic game of perfect information. One model calculates the anti-dumping rate according to current anti-dumping regulations, but it is not optimal. The other is an optimal anti-dumping model obtained by maximizing domestic social welfare. Through a detailed comparison of these two models, several shortcomings of the anti-dumping rate model based on current regulations are revealed. Finally, it is suggested that the WTO and China should use the optimal model to calculate the anti-dumping rate.
In this paper, the irrational-behavior-proof conditions in a class of stochastic dynamic games over event trees are presented. Four kinds of irrational-behavior-proof conditions are proposed by means of the imputation distribution procedure, and their relationships are discussed. More specific properties of the general transformation of characteristic functions are developed, on the basis of which the irrational-behavior-proof conditions are proved to hold in a transformed cooperative game.
Potential games are noncooperative games for which there exist auxiliary functions, called potentials, such that the maximizers of the potential are also Nash equilibria of the corresponding game. Some properties of Nash equilibria, such as existence or stability, can be derived from the potential whenever it exists. We survey different classes of potential games in the static and dynamic cases, with a finite number of players, as well as in population games where a continuum of players is allowed. Theoretical concepts and applications are discussed by means of illustrative examples.
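The defining property can be checked numerically on a small example. The sketch below is not taken from the survey: it assumes a two-player finite game with an exact potential, and the function names, the prisoner's dilemma payoffs, and the candidate potential `phi` are hypothetical illustrations.

```python
from itertools import product


def is_exact_potential(u, phi):
    """Check that every unilateral deviation changes the deviator's payoff
    by exactly the change in phi. u[i][a1][a2] is player i's payoff."""
    n1, n2 = len(phi), len(phi[0])
    for a1, a2 in product(range(n1), range(n2)):
        for b1 in range(n1):      # player 1 deviates from a1 to b1
            if u[0][b1][a2] - u[0][a1][a2] != phi[b1][a2] - phi[a1][a2]:
                return False
        for b2 in range(n2):      # player 2 deviates from a2 to b2
            if u[1][a1][b2] - u[1][a1][a2] != phi[a1][b2] - phi[a1][a2]:
                return False
    return True


def pure_nash(u):
    """All pure-strategy Nash equilibria of the bimatrix game u."""
    n1, n2 = len(u[0]), len(u[0][0])
    return [(a1, a2) for a1, a2 in product(range(n1), range(n2))
            if all(u[0][b1][a2] <= u[0][a1][a2] for b1 in range(n1))
            and all(u[1][a1][b2] <= u[1][a1][a2] for b2 in range(n2))]


def potential_maximizers(phi):
    """Action profiles at which phi attains its maximum."""
    best = max(max(row) for row in phi)
    return [(a1, a2) for a1, row in enumerate(phi)
            for a2, v in enumerate(row) if v == best]


# Prisoner's dilemma, action 0 = cooperate, 1 = defect (illustrative payoffs).
u = [[[3, 0], [5, 1]],    # player 1
     [[3, 5], [0, 1]]]    # player 2
phi = [[0, 2], [2, 3]]    # candidate exact potential

print(is_exact_potential(u, phi), pure_nash(u), potential_maximizers(phi))
```

Here the unique maximizer of `phi` is mutual defection, which is also the game's only pure Nash equilibrium. The converse does not hold in general: a Nash equilibrium of a potential game need not maximize the potential.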
This paper introduces a model-free reinforcement learning technique that is used to solve a class of dynamic games known as dynamic graphical games. The graphical game results from multi-agent dynamical systems, where pinning control is used to make all the agents synchronize to the state of a command generator or a leader agent. Novel coupled Bellman equations and Hamiltonian functions are developed for the dynamic graphical games. Hamiltonian mechanics is used to derive the necessary conditions for optimality. The solution of the dynamic graphical game is given in terms of the solution to a set of coupled Hamilton-Jacobi-Bellman equations developed herein. The Nash equilibrium solution of the graphical game is given in terms of the solution to the underlying coupled Hamilton-Jacobi-Bellman equations. An online model-free policy iteration algorithm is developed to learn the Nash solution of the dynamic graphical game. This algorithm does not require any knowledge of the agents' dynamics. A proof of convergence for this multi-agent learning algorithm is given under a mild assumption about the inter-connectivity properties of the graph. A gradient descent technique with critic network structures is used to implement the policy iteration algorithm and solve the graphical game online in real time.
The solvability of the coupled Riccati differential equations appearing in the differential game approach to the formation control problem is vital to the finite-horizon Nash equilibrium solution. These equations, if solvable, can be solved numerically by using the terminal value and backward iteration. To investigate the solvability and solution of these equations, the formation control problem posed as a differential game is replaced by a discrete-time dynamic game. The main contributions of this paper are as follows. First, the existence of Nash equilibrium controls for the discrete-time formation control problem is shown. Second, a backward-iteration approximate solution to the coupled Riccati differential equations in the continuous-time differential game is developed. An illustrative example is given to justify the models and solution.
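The idea of starting from the terminal value and iterating backward can be sketched on a much simpler problem than the paper's formation control game. The example below assumes a scalar two-player discrete-time linear-quadratic game with dynamics x[k+1] = a*x[k] + b1*u1[k] + b2*u2[k] and stage costs q_i*x^2 + r_i*u_i^2. The function name and all parameter names are hypothetical, and these are the standard feedback Nash recursions for that toy game, not the paper's equations.

```python
def feedback_nash_scalar(a, b, q, r, qf, N):
    """Backward recursion for a scalar two-player LQ game.

    b, q, r, qf are pairs (player 1, player 2); r[i] > 0 is assumed.
    Player i's value function at stage k is p_i[k] * x**2 and its
    feedback law is u_i = -k_i * x.
    Returns (gains, values), both ordered from stage 0 forward.
    """
    p = [qf[0], qf[1]]                   # terminal values p_i[N] = qf_i
    gains, values = [], [tuple(p)]
    for _ in range(N):                   # stages N-1, N-2, ..., 0
        # Coupled first-order conditions, linear in (k1, k2):
        #   (r1 + p1*b1^2)*k1 + p1*b1*b2*k2 = p1*b1*a
        #   p2*b2*b1*k1 + (r2 + p2*b2^2)*k2 = p2*b2*a
        m11 = r[0] + p[0] * b[0] ** 2
        m12 = p[0] * b[0] * b[1]
        m21 = p[1] * b[1] * b[0]
        m22 = r[1] + p[1] * b[1] ** 2
        rhs1, rhs2 = p[0] * b[0] * a, p[1] * b[1] * a
        det = m11 * m22 - m12 * m21      # positive when r_i > 0 and p_i >= 0
        k1 = (rhs1 * m22 - m12 * rhs2) / det
        k2 = (m11 * rhs2 - m21 * rhs1) / det
        a_cl = a - b[0] * k1 - b[1] * k2  # closed-loop dynamics
        # Coupled Riccati-type update of both value coefficients.
        p = [q[0] + r[0] * k1 ** 2 + p[0] * a_cl ** 2,
             q[1] + r[1] * k2 ** 2 + p[1] * a_cl ** 2]
        gains.append((k1, k2))
        values.append(tuple(p))
    gains.reverse()
    values.reverse()
    return gains, values


gains, values = feedback_nash_scalar(
    a=1.0, b=(1.0, 0.5), q=(1.0, 2.0), r=(1.0, 1.0), qf=(1.0, 1.0), N=20)
print(gains[0], values[0])
```

In this scalar case the 2x2 system for the gains is always solvable when both control weights are positive, so the recursion never breaks down. In the matrix case the corresponding solvability condition is not automatic, which is why existence has to be established separately.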
The spoofing capability of a Global Navigation Satellite System (GNSS) is an important confrontational capability for navigation security, and the success of planned missions may depend on its effective evaluation. However, current evaluation systems face challenges arising from the irrationality of previous weighting methods, the inapplicability of conventional multi-attribute decision-making methods, and the uncertainty inherent in evaluation. To address these difficulties while ensuring the validity of the results, an evaluation method based on a game aggregated weight model and a joint approach combining grey relational analysis with the technique for order preference by similarity to an ideal solution (GRA-TOPSIS) is first proposed to determine the optimal scheme. Static and dynamic evaluation results under different schemes are then obtained via a fuzzy comprehensive assessment and an improved dynamic game method, in order to rank the deceptive efficacy of the equipment accurately and make targeted improvements to its core performance. Judging indicators such as the Spearman rank correlation coefficient, combined with the evaluation results, demonstrate the superiority of the proposed method and the optimal scheme through horizontal comparison of different methods and vertical comparison of evaluation results. Finally, field measurements and simulation tests show that the proposed method better overcomes the difficulties of existing methods and achieves effective evaluation.
It is expected that multiple virtual power plants (multi-VPPs) will join and participate in the future local energy market (LEM). The trading behaviors of these VPPs need to be carefully studied in order to maximize the benefits brought to the local energy market operator (LEMO) and to each VPP. We propose a bounded-rationality-based trading model of multi-VPPs in the local energy market, using a dynamic game approach with different trading targets. Three types of power bidding models for VPPs are first set up with different trading targets. In the dynamic game process, VPPs can also improve their degree of rationality and then find the most suitable target for different requirements through evolutionary learning, after considering the opponents' bidding strategies and their own clustered resources. The LEMO decides the electricity buying and selling prices in the LEM. Furthermore, the proposed dynamic game model is solved by a hybrid method consisting of an improved particle swarm optimization (IPSO) algorithm and conventional large-scale optimization. Finally, case studies are conducted to show the performance of the proposed model and solution approach, which may provide some insights for VPPs participating in the LEM in real-world complex scenarios.
Based on the two-stage dynamic game of complete information and the theory of pure-strategy perfect equilibrium, the non-cooperative game between the police department and criminals is analyzed. The dynamic game is shown to predict and explain the potential strategic choices of the police department and the criminals at various stages, so that the essence of law enforcement can be analyzed through theoretical models.
In managing an international project, claims are very important. In this paper, a dynamic game model of complete information is designed. Using the Nash equilibrium values, the strong influence of claim cost on claim strategy is demonstrated, and the importance of claims to both sides of a contract, especially the contractor, is elucidated. Claim opportunities are also discussed using game theory. Finally, from the perspective of a repeated game, and by comparing the Pareto-optimal and Nash equilibrium values, it is concluded that the best payoff can be obtained with an honest attitude and through cooperation between companies.
Funding (SSUS/SCUD evolutionary games paper): Project supported by the Doctoral Foundation Project of Guizhou University (Grant No. (2019)49), the National Natural Science Foundation of China (Grant No. 71961003), and the Science and Technology Program of Guizhou Province (Grant No. 7223).
Funding (MC-NFSP paper): National Key Research and Development Program of China (2017YFB1002503); Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2018AAA0100902), China.
Funding (general transformation of characteristic functions paper): National Natural Science Foundation of China under Grant No. 71571108; China Postdoctoral Science Foundation Funded Project under Grant No. 2016M600525; Qingdao Postdoctoral Application Research Project under Grant No. 2016029.
Funding (multi-UAV cooperative air combat paper): supported by the National Natural Science Foundation of China (Grant Nos. 61933010 and 61903301) and the Shaanxi Aerospace Flight Vehicle Design Key Laboratory.
Funding (typhoon defense strategy paper): supported by the National Natural Science Foundation of China (No. U1766204).
Funding (multilayer satellite network paper): This work is supported by the National Natural Science Foundation of China (Grant No. 61971032) and the Fundamental Research Funds for the Central Universities (Grant No. FRF-TP-18-008A3).
Funding (Nash-Stackelberg-Nash air formation paper): College Doctor Foundation (20060699026); Aviation Basic Scientific Foundation (05D53021).
Funding: Supported by the National Natural Science Foundation of China (No. 72171126), the China Postdoctoral Science Foundation (No. 2016M600525), and the Qingdao Postdoctoral Application Research Project (No. 2016029).
Abstract: In this paper, irrational-behavior-proof conditions are presented for a class of stochastic dynamic games over event trees. Four kinds of irrational-behavior-proof conditions are proposed by means of the imputation distribution procedure, and their relationships are discussed. More specific properties of the general transformation of characteristic functions are developed, and on this basis the irrational-behavior-proof conditions are proved to hold in a transformed cooperative game.
Funding: Supported by the Consejo Nacional de Ciencia y Tecnología of Mexico (Grant No. 221291).
Abstract: Potential games are noncooperative games for which there exist auxiliary functions, called potentials, such that the maximizers of the potential are also Nash equilibria of the corresponding game. Some properties of Nash equilibria, such as existence or stability, can be derived from the potential, whenever it exists. We survey different classes of potential games in the static and dynamic cases, with a finite number of players, as well as in population games where a continuum of players is allowed. Likewise, theoretical concepts and applications are discussed by means of illustrative examples.
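The defining property stated in this abstract can be illustrated with a small sketch. The payoff matrices `U1`, `U2` and the potential `P` below are hypothetical values chosen for illustration (a prisoner's-dilemma-style game) and are not taken from the surveyed paper. The code checks the exact-potential condition, in which each unilateral payoff change equals the change in the potential, and then compares the maximizers of the potential with the pure-strategy Nash equilibria.

```python
from itertools import product

# Hypothetical 2x2 game: action 0 = cooperate, action 1 = defect.
# U1[a1][a2] and U2[a1][a2] are the payoffs of players 1 and 2.
U1 = [[3, 0], [5, 1]]
U2 = [[3, 5], [0, 1]]
# Candidate potential function (an assumption, verified below).
P = [[0, 2], [2, 3]]
ACTIONS = (0, 1)


def is_exact_potential(u1, u2, p):
    """Check that every unilateral deviation changes the deviator's
    payoff by exactly the change in the potential."""
    for a1, a2 in product(ACTIONS, ACTIONS):
        for b1 in ACTIONS:  # player 1 deviates from a1 to b1
            if u1[b1][a2] - u1[a1][a2] != p[b1][a2] - p[a1][a2]:
                return False
        for b2 in ACTIONS:  # player 2 deviates from a2 to b2
            if u2[a1][b2] - u2[a1][a2] != p[a1][b2] - p[a1][a2]:
                return False
    return True


def pure_nash(u1, u2):
    """Return all pure-strategy profiles with no profitable unilateral deviation."""
    equilibria = []
    for a1, a2 in product(ACTIONS, ACTIONS):
        best1 = all(u1[a1][a2] >= u1[b1][a2] for b1 in ACTIONS)
        best2 = all(u2[a1][a2] >= u2[a1][b2] for b2 in ACTIONS)
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria


def potential_maximizers(p):
    """Return all profiles at which the potential attains its maximum."""
    best = max(p[a1][a2] for a1, a2 in product(ACTIONS, ACTIONS))
    return [(a1, a2) for a1, a2 in product(ACTIONS, ACTIONS) if p[a1][a2] == best]


if __name__ == "__main__":
    print(is_exact_potential(U1, U2, P))
    print(potential_maximizers(P))
    print(pure_nash(U1, U2))
```

In this toy game the single maximizer of `P` coincides with the only pure Nash equilibrium; in general, a potential game may have further equilibria that are not maximizers of the potential.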
Funding: Supported by the Deanship of Scientific Research at King Fahd University of Petroleum & Minerals Project (No. JF141002); the National Science Foundation (No. ECCS-1405173); the Office of Naval Research (Nos. N000141310562, N000141410718); the U.S. Army Research Office (No. W911NF-11-D-0001); the National Natural Science Foundation of China (No. 61120106011); the Project 111 from the Ministry of Education of China (No. B08015).
Abstract: This paper introduces a model-free reinforcement learning technique that is used to solve a class of dynamic games known as dynamic graphical games. The graphical game results from multi-agent dynamical systems, where pinning control is used to make all the agents synchronize to the state of a command generator or a leader agent. Novel coupled Bellman equations and Hamiltonian functions are developed for the dynamic graphical games. The Hamiltonian mechanics are used to derive the necessary conditions for optimality. The Nash equilibrium solution for the dynamic graphical game is given in terms of the solution to a set of coupled Hamilton-Jacobi-Bellman equations developed herein. An online model-free policy iteration algorithm is developed to learn the Nash solution for the dynamic graphical game. This algorithm does not require any knowledge of the agents' dynamics. A proof of convergence for this multi-agent learning algorithm is given under mild assumptions about the inter-connectivity properties of the graph. A gradient descent technique with critic network structures is used to implement the policy iteration algorithm to solve the graphical game online in real time.
Abstract: The solvability of the coupled Riccati differential equations appearing in the differential game approach to the formation control problem is vital to the finite-horizon Nash equilibrium solution. These equations (if solvable) can be solved numerically by using the terminal value and the backward iteration. To investigate the solvability and solution of these equations, the formation control problem as the differential game is replaced by a discrete-time dynamic game. The main contributions of this paper are as follows. First, the existence of Nash equilibrium controls for the discrete-time formation control problem is shown. Second, a backward iteration approximate solution to the coupled Riccati differential equations in the continuous-time differential game is developed. An illustrative example is given to justify the models and solution.
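The idea of solving from the terminal value by backward iteration can be sketched in a much simpler setting. The example below is an assumption-laden illustration, not the paper's method: it uses a scalar, single-controller, discrete-time linear-quadratic problem instead of the coupled multi-player Riccati equations, and the function name `riccati_backward` and all parameter values are hypothetical.

```python
def riccati_backward(a, b, q, r, q_final, horizon):
    """Backward Riccati recursion for the scalar system x[k+1] = a*x[k] + b*u[k]
    with stage cost q*x^2 + r*u^2 and terminal cost q_final*x^2.

    Returns the list p[0..horizon], where p[horizon] is the terminal value
    and each earlier entry is computed from the one after it.
    """
    p = [0.0] * (horizon + 1)
    p[horizon] = q_final  # start from the terminal condition
    for k in range(horizon - 1, -1, -1):  # iterate backward in time
        nxt = p[k + 1]
        p[k] = q + a * a * nxt - (a * b * nxt) ** 2 / (r + b * b * nxt)
    return p


if __name__ == "__main__":
    # Hypothetical parameters: a = b = q = r = 1, zero terminal cost.
    values = riccati_backward(1.0, 1.0, 1.0, 1.0, 0.0, 50)
    print(values[50], values[49], values[0])
```

With these illustrative parameters the recursion settles, far from the terminal time, near the positive root of P² − P − 1 = 0, which is the stationary solution of the scalar equation. The coupled game setting of the abstract replaces this single recursion with one interdependent recursion per player.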
Funding: Supported by the National Natural Science Foundation of China (41804035, 41374027).
Abstract: The spoofing capability of the Global Navigation Satellite System (GNSS) represents an important confrontational capability for navigation security, and the success of planned missions may depend on the effective evaluation of spoofing capability. However, current evaluation systems face challenges arising from the irrationality of previous weighting methods, the inapplicability of the conventional multi-attribute decision-making method, and the uncertainty existing in evaluation. To solve these difficulties, considering the validity of the obtained results, an evaluation method based on the game aggregated weight model and a joint approach involving grey relational analysis and the technique for order preference by similarity to an ideal solution (GRA-TOPSIS) are first proposed to determine the optimal scheme. Static and dynamic evaluation results under different schemes are then obtained via a fuzzy comprehensive assessment and an improved dynamic game method, to prioritize the deceptive efficacy of the equipment accurately and make targeted improvements to its core performance. The use of judging indicators, including the Spearman rank correlation coefficient, combined with the obtained evaluation results, demonstrates the superiority of the proposed method and the optimal scheme through a horizontal comparison of different methods and a vertical comparison of evaluation results. Finally, the results of field measurements and simulation tests show that the proposed method can better overcome the difficulties of existing methods and achieve effective evaluation.
Funding: This work was supported by the National Key R&D Program of China (Grant No. 2019YFE0123600), the National Science Foundation of China (Grant No. 52077146), and the Young Elite Scientists Sponsorship Program by CSEE (Grant No. CESS-YESS-2019027).
Abstract: It is expected that multiple virtual power plants (multi-VPPs) will join and participate in the future local energy market (LEM). The trading behaviors of these VPPs need to be carefully studied in order to maximize the benefits brought to the local energy market operator (LEMO) and each VPP. We propose a bounded rationality-based trading model of multi-VPPs in the local energy market by using a dynamic game approach with different trading targets. Three types of power bidding models for VPPs are first set up with different trading targets. In the dynamic game process, VPPs can also improve their degree of rationality and then find the most suitable target for different requirements by evolutionary learning, after considering the opponents' bidding strategies and their own clustered resources. The LEMO decides the electricity buying and selling prices in the LEM. Furthermore, the proposed dynamic game model is solved by a hybrid method consisting of an improved particle swarm optimization (IPSO) algorithm and conventional large-scale optimization. Finally, case studies are conducted to show the performance of the proposed model and solution approach, which may provide some insights for VPPs participating in the LEM in real-world complex scenarios.
Abstract: Based on the two-stage dynamic game of complete information and the theory of pure-strategy perfect equilibrium, the noncooperative game between the police department and criminals is analyzed. The dynamic game is shown to forecast and explain the potential strategic choices of the police department and the criminals at various stages, so that the essence of law enforcement can be analyzed with theoretical models.
Abstract: In managing an international project, claims are very important. In this paper, a complete-information dynamic game model is designed; with the Nash equilibrium values, the strong influence of claim cost on claim strategy is demonstrated, and the importance of claims to both sides of a contract, especially the contractor, is elucidated. Claim chances are also discussed using game theory. Finally, from the perspective of a repeated game and by comparing the Pareto-optimal and Nash equilibrium values, it is concluded that the best payoff can be obtained with an honest attitude and through cooperation between companies.