Abstract: In this paper, we consider multiobjective two-person zero-sum games with vector payoffs and vector fuzzy payoffs. We translate such games into the corresponding multiobjective programming problems and introduce the pessimistic Pareto optimal solution concept by assuming that a player supposes the opponent adopts the strategy most disadvantageous to that player. It is shown that any pessimistic Pareto optimal solution can be obtained by linear programming techniques even if the membership functions for the objective functions are nonlinear. Moreover, we propose interactive algorithms based on the bisection method to obtain a pessimistic compromise solution from among the set of all pessimistic Pareto optimal solutions. To show the efficiency of the proposed method, we illustrate the interactive processes with an application to a vegetable shipment problem.
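As a rough illustration of the pessimistic viewpoint described in this abstract, the sketch below evaluates a mixed strategy against several crisp payoff matrices, taking for each objective the column the opponent would pick to hurt the row player most. This is one plausible reading only: the function name `worst_case_payoffs`, the example matrices, and the restriction to crisp (non-fuzzy) payoffs are assumptions made for illustration, not the authors' formulation.

```python
def worst_case_payoffs(x, payoff_matrices):
    """Return, for each objective, the row player's worst-case expected payoff.

    x               -- mixed strategy of the row player (probabilities over rows)
    payoff_matrices -- list of payoff matrices, one per objective (hypothetical data)

    Because expected payoff is linear in the opponent's mixed strategy,
    the minimum over mixed strategies is attained at a pure column.
    """
    result = []
    for A in payoff_matrices:
        n_cols = len(A[0])
        # Expected payoff of x against each pure column of this objective.
        expected = [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(n_cols)]
        result.append(min(expected))
    return result


# Two objectives, two pure strategies per player (illustrative numbers only).
A1 = [[1, -1], [-1, 1]]
A2 = [[2, 0], [0, 1]]
print(worst_case_payoffs([0.5, 0.5], [A1, A2]))
```

A pessimistic Pareto optimal strategy would then be one whose worst-case vector cannot be improved in one objective without worsening another; finding it is the multiobjective programming problem the abstract refers to.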
Abstract: China is currently the largest developing country in the world, and the US is the largest developed country. Sino-US economic and trade relations are of great significance to the two nations and may have a prominent impact on the stability and development of the global economy.
Abstract: There are a few studies that focus on solution methods for finding a Nash equilibrium of zero-sum games. We discuss the use of Karmarkar’s interior point method to solve the Nash equilibrium problem of a zero-sum game, and prove that it is theoretically a polynomial-time algorithm. We implement the Karmarkar method, and a preliminary computational result shows that it performs well for zero-sum games. We also mention an affine scaling method that would help us compute Nash equilibria of general zero-sum games effectively.
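For context, the standard linear program for a finite zero-sum matrix game is sketched below; it is the kind of problem an interior point method can be applied to. The notation (an m × n payoff matrix with entries a_ij, the row player's mixed strategy x, and the game value v) is assumed here and is not taken from the paper itself.

```latex
\begin{aligned}
\max_{x,\,v}\quad & v \\
\text{s.t.}\quad & \sum_{i=1}^{m} a_{ij}\, x_i \ge v, \qquad j = 1,\dots,n, \\
& \sum_{i=1}^{m} x_i = 1, \qquad x_i \ge 0, \quad i = 1,\dots,m.
\end{aligned}
```

The column player's problem is the dual of this program, and an optimal primal-dual pair gives a Nash equilibrium of the game.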
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2006AA04Z183), the National Natural Science Foundation of China (60621001, 60534010, 60572070, 60774048, 60728307), and the Program for Changjiang Scholars and Innovative Research Groups of China (60728307, 4031002).
Funding: Supported by the National Science Foundation (No. ECCS-0801330) and the Army Research Office (No. W91NF-05-1-0314).
Abstract: This paper presents an approximate/adaptive dynamic programming (ADP) algorithm that uses the idea of integral reinforcement learning (IRL) to determine online the Nash equilibrium solution for the two-player zero-sum differential game with linear dynamics and infinite-horizon quadratic cost. The algorithm is built around an iterative method that has been developed in the control engineering community for solving the continuous-time game algebraic Riccati equation (CT-GARE), which underlies the game problem. We show how the ADP techniques enhance the capabilities of the offline method, allowing an online solution without requiring complete knowledge of the system dynamics. The feasibility of the ADP scheme is demonstrated in simulation for a power system control application. The adaptation goal is the control policy that optimally counteracts the highest load disturbance.
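To make the role of the game algebraic Riccati equation concrete, here is a minimal scalar sketch. It assumes the usual form 2ap + q - p²(b²/r - d²/γ²) = 0 for the system dx/dt = ax + bu + dw with cost integrand qx² + ru² - γ²w², and solves it in closed form with full model knowledge. It is not the online ADP/IRL algorithm of the paper; all names and numbers are hypothetical.

```python
import math


def scalar_gare_solution(a, b, d, q, r, gamma):
    """Stabilizing solution p of 2*a*p + q - s*p**2 = 0, with s = b^2/r - d^2/gamma^2.

    Assumes s > 0 and q >= 0 (an assumption of this sketch), so the
    quadratic has one positive root.
    """
    s = b * b / r - d * d / (gamma * gamma)
    if s <= 0:
        raise ValueError("this sketch assumes b^2/r > d^2/gamma^2")
    return (a + math.sqrt(a * a + q * s)) / s


def gare_residual(p, a, b, d, q, r, gamma):
    """Left-hand side of the scalar GARE; zero at a solution."""
    s = b * b / r - d * d / (gamma * gamma)
    return 2.0 * a * p + q - s * p * p


# Illustrative parameters only.
params = dict(a=-1.0, b=1.0, d=1.0, q=1.0, r=1.0, gamma=2.0)
p = scalar_gare_solution(**params)
control_gain = -params["b"] / params["r"] * p             # u = control_gain * x
disturbance_gain = params["d"] / params["gamma"] ** 2 * p  # w = disturbance_gain * x
print(p, gare_residual(p, **params), control_gain, disturbance_gain)
```

The two gains are the saddle-point state-feedback policies of the minimizing control and the maximizing disturbance for this scalar model.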
Abstract: In this paper we study zero-sum stochastic games. The optimality criterion is the long-run expected average criterion, and the payoff function may have neither upper nor lower bounds. We give a new set of conditions for the existence of a value and a pair of optimal stationary strategies. Our conditions are slightly weaker than those in the previous literature, and some new sufficient conditions for the existence of a pair of optimal stationary strategies are imposed on the primitive data of the model. Our results are illustrated with a queueing system, for which our conditions are satisfied but some of the conditions in the previous literature fail to hold.
Abstract: In this paper, a zero-sum game Nash equilibrium computation problem with a common constraint set is investigated under two time-varying multi-agent subnetworks, where the two subnetworks have opposite payoff functions. A novel distributed projection subgradient algorithm with a random sleep scheme is developed to reduce the computational load of the agents in the process of computing the Nash equilibrium. In our algorithm, each agent uses an independent, identically distributed Bernoulli decision to determine whether to compute the subgradient and perform the projection operation or to keep the previous consensus estimate, which effectively reduces the amount of computation and the calculation time. Moreover, the traditional stepsize assumption adopted in existing methods is removed, and the stepsizes in our algorithm are randomized and diminishing. In addition, we prove that under our algorithm all agents converge to the Nash equilibrium with probability 1. Finally, a simulation example verifies the validity of our algorithm.
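The random sleep idea can be illustrated on a toy problem. The sketch below is a strong simplification: one minimizing variable and one maximizing variable (rather than two subnetworks of agents with consensus steps), a hypothetical payoff f(x, y) = x² + xy - y² on the box [-1, 1]², and a deterministic stepsize 1/(k+1) instead of the randomized stepsizes of the paper. Each side wakes with a Bernoulli probability and otherwise keeps its previous value.

```python
import random


def project(v, lo=-1.0, hi=1.0):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, v))


def random_sleep_saddle(steps=20000, wake_prob=0.5, seed=0):
    """Projected gradient descent-ascent with Bernoulli 'sleep' decisions.

    Payoff (assumed for illustration): f(x, y) = x^2 + x*y - y^2,
    whose saddle point on [-1, 1]^2 is (0, 0).
    """
    rng = random.Random(seed)
    x, y = 1.0, -1.0
    for k in range(1, steps + 1):
        alpha = 1.0 / (k + 1)   # diminishing stepsize
        gx = 2.0 * x + y        # partial derivative of f in x
        gy = x - 2.0 * y        # partial derivative of f in y
        if rng.random() < wake_prob:   # minimizer awake: descend and project
            x = project(x - alpha * gx)
        if rng.random() < wake_prob:   # maximizer awake: ascend and project
            y = project(y + alpha * gy)
    return x, y


print(random_sleep_saddle())
```

On average each side performs only a fraction `wake_prob` of the gradient and projection computations, which is the saving the random sleep scheme aims for.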
Abstract: In this paper, a zero-sum game Nash equilibrium computation problem with event-triggered communication is investigated under an undirected weight-balanced multi-agent network. A novel distributed event-triggered projection subgradient algorithm is developed to reduce the communication burden within the subnetworks. In the proposed algorithm, when the difference between the current state of an agent and its state at the last trigger time exceeds a given threshold, the agent is triggered to communicate with its neighbours. Moreover, we prove that all agents converge to the Nash equilibrium under the proposed algorithm. Finally, two simulation examples verify that, under an appropriate threshold, our algorithm not only reduces the communication burden but also keeps the convergence speed and accuracy close to those of the time-triggered method.
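The trigger rule in this abstract can be sketched in a few lines. The example below applies a fixed threshold to a scalar state trajectory of a single agent and records the steps at which it would broadcast; the function name, the constant threshold, and the decaying trajectory are assumptions for illustration and do not reproduce the paper's algorithm.

```python
def event_triggered_broadcasts(states, threshold):
    """Return the step indices at which the agent would communicate.

    The agent broadcasts at step 0, then again whenever its current state
    differs from the last broadcast state by more than `threshold`.
    """
    last_sent = states[0]
    trigger_steps = [0]
    for k in range(1, len(states)):
        if abs(states[k] - last_sent) > threshold:
            last_sent = states[k]      # neighbours now hold this value
            trigger_steps.append(k)
    return trigger_steps


# Hypothetical state trajectory converging to 0.
trajectory = [0.9 ** k for k in range(50)]
steps = event_triggered_broadcasts(trajectory, threshold=0.05)
print(len(steps), "broadcasts instead of", len(trajectory))
```

A threshold of zero recovers communication at every step for this trajectory, while a larger threshold trades accuracy of the neighbours' information for fewer messages.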
Abstract: This paper studies the policy iteration algorithm (PIA) for zero-sum stochastic differential games with the basic long-run average criterion, as well as with its more selective version, the so-called bias criterion. The system is assumed to be a nondegenerate diffusion. We use Lyapunov-like stability conditions that ensure the existence and boundedness of the solution to a certain Poisson equation. We also ensure the convergence of a sequence of such solutions, of the corresponding sequence of policies, and, ultimately, of the PIA.
Abstract: We consider a finite-horizon, zero-sum linear quadratic differential game. The feature of this game is that a weight matrix of the minimiser’s control cost in the cost functional is singular. Due to this singularity, the game can be solved neither by applying the Isaacs MinMax principle nor by using the Bellman–Isaacs equation approach, i.e., this game is singular. A previous paper by one of the authors analysed such a game in the case where the cost functional does not contain the minimiser’s control cost at all, i.e., the weight matrix of this cost equals zero. In this case, all coordinates of the minimiser’s control are singular. In the present paper, we study the general case where the weight matrix of the minimiser’s control cost, being singular, is not, in general, zero. This means that only a part of the coordinates of the minimiser’s control is singular, while the others are regular. The considered game is treated by regularisation, i.e., by its approximate conversion to an auxiliary regular game. The latter has the same equation of dynamics and a similar cost functional augmented by an integral of the squares of the singular control coordinates with a small positive weight. Thus, the auxiliary game is a partial cheap control differential game. Based on a singular perturbation asymptotic analysis of this auxiliary game, the existence of the value of the original (singular) game is established, and its expression is obtained. The maximiser’s optimal state feedback strategy and the minimising control sequence in the original game are designed. It is shown that the coordinates of the minimising control sequence corresponding to the regular coordinates of the minimiser’s control are point-wise convergent in the class of regular functions. The optimal trajectory sequence and the optimal trajectory in the considered singular game are also obtained. An illustrative example is presented.
Funding: Supported by the National Natural Science Foundation of China (60573032, 60773092 and 61073149) and the Research Fund for the Doctoral Program of Higher Education of China (20090073110027).
Abstract: Keccak is one of the five hash functions selected for the final round of the SHA-3 competition, and its inner primitive is a permutation called Keccak-f. In this paper, we observe that for the inverse of the only nonlinear transformation in Keccak-f, the algebraic degree of any output coordinate and that of the product of any two output coordinates are both 3, which is 2 less than its size of 5. Combining this observation with a proposition on the upper bound of the degree of iterated permutations, we improve the zero-sum distinguisher for the Keccak-f permutation with the full 24 rounds by lowering the size of the zero-sum partition from 2^1590 to 2^1575.
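The link between algebraic degree and zero-sums can be checked on a small scale. The sketch below implements the 5-bit nonlinear row map of Keccak-f (commonly called χ, of degree 2) and its inverse by table lookup, then XORs outputs over affine subspaces: a map of degree d sums to zero over any affine subspace of dimension greater than d. This toy works on a single 5-bit row, not on the full permutation or the partitions of the paper; helper names are hypothetical.

```python
from functools import reduce
from itertools import product

N = 5  # row length of the nonlinear map


def chi(v):
    """Keccak row map: out[i] = in[i] XOR ((NOT in[i+1]) AND in[i+2]), indices mod 5."""
    out = 0
    for i in range(N):
        a = (v >> i) & 1
        b = (v >> ((i + 1) % N)) & 1
        c = (v >> ((i + 2) % N)) & 1
        out |= (a ^ ((b ^ 1) & c)) << i
    return out


# The map is a permutation of the 32 row values, so it can be inverted by lookup.
CHI_INV = {chi(v): v for v in range(1 << N)}


def affine_subspace(offset, basis):
    """All points offset XOR (any XOR-combination of the basis vectors)."""
    points = []
    for coeffs in product((0, 1), repeat=len(basis)):
        p = offset
        for c, b in zip(coeffs, basis):
            if c:
                p ^= b
        points.append(p)
    return points


def xor_all(values):
    return reduce(lambda x, y: x ^ y, values, 0)


# Degree 2 forward map: zero-sum over a 3-dimensional affine subspace.
cube3 = affine_subspace(0b10000, [0b00001, 0b00010, 0b00100])
print(xor_all(cube3), xor_all(chi(p) for p in cube3))

# Degree 3 inverse map: zero-sum over a 4-dimensional subspace.
cube4 = affine_subspace(0, [0b00001, 0b00010, 0b00100, 0b01000])
print(xor_all(cube4), xor_all(CHI_INV[p] for p in cube4))
```

In both cases the inputs and the outputs XOR to zero, which is the defining property of a zero-sum structure; a lower degree bound allows smaller subspaces, hence smaller zero-sum partitions.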