Abstract
In game theory, a single player usually cannot control the payoffs of all players in a game. An exception is the equalizer strategy, proposed at the end of the 20th century for the repeated prisoner's dilemma: a player using it can set the opponent's payoff to any designated value within an interval determined by the payoff function, regardless of which strategy the opponent uses. More than a decade later, this result was generalized by the discovery of zero-determinant (ZD) strategies, which allow a player to unilaterally enforce a linear relationship between their own payoff and that of the opponent. The question of how payoff control can be established has since attracted significant attention from computer scientists, control theorists, and evolutionary biologists, and many new results have been derived. This paper surveys the current state of research on payoff control and presents techniques for enforcing linear or nonlinear payoff relations in one-shot and repeated games. In particular, we summarize the main advances from four aspects: the basic concept of payoff control, the forms of payoff relation that can be enforced, the forms of strategy that can control payoffs, and the evolutionary behavior of these strategies. We also discuss directions for future research.
Authors
Long WANG (王龙); Fang CHEN (陈芳); Xingru CHEN (陈星如)
Center for Systems and Control, Peking University, Beijing 100871, China; School of Sciences, Beijing University of Posts and Telecommunications, Beijing 100876, China
Source
Scientia Sinica Informationis (《中国科学:信息科学》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2023, Issue 4, pp. 623-646 (24 pages)
Funding
Supported by the National Natural Science Foundation of China (Grant No. 62036002).
Keywords
game theory
payoff control
zero-determinant strategy
evolutionary game theory
strategy design