Zero-sum game

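From 集智百科 - complex systems | artificial intelligence | complexity science | complex networks | self-organization
Revision by 水流心不竞 (talk | contribs) as of 10:53, 31 October 2020 (Saturday)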

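This entry was first translated by 水流心不竞 (1,421 words translated) and has not yet been reviewed; apologies for any inconvenience this causes when reading.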


In game theory and economic theory, a zero-sum game is a mathematical representation of a situation in which each participant's gain or loss of utility is exactly balanced by the losses or gains of the utility of the other participants. If the total gains of the participants are added up and the total losses are subtracted, they will sum to zero. Thus, cutting a cake, where taking a larger piece reduces the amount of cake available for others as much as it increases the amount available for that taker, is a zero-sum game if all participants value each unit of cake equally (see marginal utility).



In contrast, non-zero-sum describes a situation in which the interacting parties' aggregate gains and losses can be less than or more than zero. A zero-sum game is also called a strictly competitive game while non-zero-sum games can be either competitive or non-competitive. Zero-sum games are most often solved with the minimax theorem which is closely related to linear programming duality,[1] or with Nash equilibrium.



Many people have a cognitive bias towards seeing situations as zero-sum, known as zero-sum bias.


Zero-sum games are a specific example of constant sum games where the sum of each outcome is always zero. Such games are distributive, not integrative; the pie cannot be enlarged by good negotiation.
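The relationship between constant-sum and zero-sum games can be sketched in a few lines. The function names and the sample payoffs below are illustrative assumptions, not part of the original entry; the sketch only checks whether every outcome distributes the same total, and whether that total is zero.

```python
def outcome_totals(outcomes):
    """Return the set of payoff totals over all outcomes.

    `outcomes` is a list of payoff tuples, one payoff per player.
    """
    return {sum(payoffs) for payoffs in outcomes}


def is_constant_sum(outcomes):
    # Constant-sum: every outcome distributes the same total.
    return len(outcome_totals(outcomes)) == 1


def is_zero_sum(outcomes):
    # Zero-sum: the special case where that constant total is zero.
    return outcome_totals(outcomes) == {0}


# Hypothetical data: splitting a pie of 10 units is constant-sum but not zero-sum.
pie_split = [(10, 0), (7, 3), (5, 5)]
# Measuring each share relative to an even split turns it into a zero-sum game.
relative_split = [(a - 5, b - 5) for a, b in pie_split]
```

Shifting every payoff by a constant, as in `relative_split`, changes no player's incentives, which is why constant-sum games can be analysed with the same tools as zero-sum games.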



Definition

Situations where participants can all gain or suffer together are referred to as non-zero-sum. Thus, a country with an excess of bananas trading with another country for their excess of apples, where both benefit from the transaction, is in a non-zero-sum situation. Other non-zero-sum games are games in which the sum of gains and losses by the players are sometimes more or less than what they began with.



The zero-sum property (if one gains, another loses) means that any result of a zero-sum situation is Pareto optimal. Generally, any game where all strategies are Pareto optimal is called a conflict game.[2]

The idea of Pareto optimal payoff in a zero-sum game gives rise to a generalized relative selfish rationality standard, the punishing-the-opponent standard, where both players always seek to minimize the opponent's payoff at a favorable cost to themselves rather than simply preferring more to less. The punishing-the-opponent standard can be used in both zero-sum games (e.g. warfare games, chess) and non-zero-sum games (e.g. pooling selection games).[3]

Solution

For two-player finite zero-sum games, the different game-theoretic solution concepts of Nash equilibrium, minimax, and maximin all give the same solution. If the players are allowed to play a mixed strategy, the game always has an equilibrium.



Example

[Table: payoff matrix "A zero-sum game", with Red choosing row 1 or 2 and Blue choosing column A, B or C]

A game's payoff matrix is a convenient representation. Consider for example the two-player zero-sum game pictured at right or above.


The order of play proceeds as follows: The first player (red) chooses in secret one of the two actions 1 or 2; the second player (blue), unaware of the first player's choice, chooses in secret one of the three actions A, B or C. Then, the choices are revealed and each player's points total is affected according to the payoff for those choices.


Example: Red chooses action 2 and Blue chooses action B. When the payoff is allocated, Red gains 20 points and Blue loses 20 points.


In this example game, both players know the payoff matrix and attempt to maximize the number of their points. Red could reason as follows: "With action 2, I could lose up to 20 points and can win only 20, and with action 1 I can lose only 10 but can win up to 30, so action 1 looks a lot better." With similar reasoning, Blue would choose action C. If both players take these actions, Red will win 20 points. If Blue anticipates Red's reasoning and choice of action 1, Blue may choose action B, so as to win 10 points. If Red, in turn, anticipates this trick and goes for action 2, this wins Red 20 points.
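Red's worst-case reasoning, and Blue's mirror image of it, can be sketched as follows. The payoff table in the code is an assumption: it is chosen to agree with every payoff mentioned in the text (for instance, Red gains 20 when Red plays 2 and Blue plays B), but the remaining cells and all function names are illustrative.

```python
# Assumed payoff table for Red (rows: actions 1, 2; columns: A, B, C).
# Blue's payoff in each cell is the negative of Red's.
RED_PAYOFFS = {
    "1": {"A": 30, "B": -10, "C": 20},
    "2": {"A": -10, "B": 20, "C": -20},
}


def red_ranges(table):
    """Worst and best payoff Red can get from each of Red's actions."""
    return {action: (min(row.values()), max(row.values()))
            for action, row in table.items()}


def blue_ranges(table):
    """Worst and best payoff Blue can get from each of Blue's actions."""
    columns = next(iter(table.values())).keys()
    ranges = {}
    for col in columns:
        blue = [-row[col] for row in table.values()]
        ranges[col] = (min(blue), max(blue))
    return ranges


def safest(ranges):
    """Action with the best worst case; ties are broken by the best case."""
    return max(ranges, key=lambda action: ranges[action])


red_choice = safest(red_ranges(RED_PAYOFFS))     # Red's cautious pick
blue_choice = safest(blue_ranges(RED_PAYOFFS))   # Blue's cautious pick
```

Under the assumed table this reproduces the choices described above: Red settles on action 1 and Blue on action C, and Red's payoff at that pair is 20. The sketch also makes the instability visible, since neither choice remains a best reply once the opponent anticipates it.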


Émile Borel and John von Neumann had the fundamental insight that probability provides a way out of this conundrum. Instead of deciding on a definite action to take, the two players assign probabilities to their respective actions, and then use a random device which, according to these probabilities, chooses an action for them. Each player computes the probabilities so as to minimize the maximum expected point-loss independent of the opponent's strategy. This leads to a linear programming problem whose solution gives the optimal strategies for each player. This minimax method can compute provably optimal strategies for all two-player zero-sum games.



For the example given above, it turns out that Red should choose action 1 with probability 4/7 and action 2 with probability 3/7, and Blue should assign the probabilities 0, 4/7, and 3/7 to the three actions A, B, and C. Red will then win 20/7 points on average per game.
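A mixed-strategy solution of this kind can be checked with exact arithmetic. The sketch below assumes a payoff table for Red that agrees with the payoffs quoted in the example text (the table itself is an assumption, as are all names), and tests the mix 4/7, 3/7 for Red against 0, 4/7, 3/7 for Blue, which is what the indifference conditions yield for that table.

```python
from fractions import Fraction

# Assumed payoffs to Red; Blue receives the negative of each entry.
RED_PAYOFFS = [
    [30, -10, 20],    # Red action 1 against Blue A, B, C
    [-10, 20, -20],   # Red action 2 against Blue A, B, C
]


def expected_red_payoff(red_mix, blue_mix, table=RED_PAYOFFS):
    """Expected payoff to Red when both players randomize independently."""
    return sum(p * q * table[i][j]
               for i, p in enumerate(red_mix)
               for j, q in enumerate(blue_mix))


def pure(index, size):
    """Probability vector that plays a single action with certainty."""
    return [Fraction(int(k == index)) for k in range(size)]


red_mix = [Fraction(4, 7), Fraction(3, 7)]
blue_mix = [Fraction(0), Fraction(4, 7), Fraction(3, 7)]

value = expected_red_payoff(red_mix, blue_mix)
# What Red's mix earns against each pure Blue action (Red's guarantee is the minimum).
red_guarantees = [expected_red_payoff(red_mix, pure(j, 3)) for j in range(3)]
# What each pure Red action earns against Blue's mix (Blue's cap is the maximum).
blue_caps = [expected_red_payoff(pure(i, 2), blue_mix) for i in range(2)]
```

Because the minimum of `red_guarantees` equals the maximum of `blue_caps`, neither player can improve by deviating, which is the defining property of the minimax solution.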


Solving


The Nash equilibrium for a two-player, zero-sum game can be found by solving a linear programming problem. Suppose a zero-sum game has a payoff matrix M where element M_{i,j} is the payoff obtained when the minimizing player chooses pure strategy i and the maximizing player chooses pure strategy j (i.e. the player trying to minimize the payoff chooses the row and the player trying to maximize the payoff chooses the column). Assume every element of M is positive. The game will have at least one Nash equilibrium. The Nash equilibrium can be found (Raghavan 1994, p. 740) by solving the following linear program to find a vector u:



Minimize:
[math]\displaystyle{ \sum_{i} u_i }[/math]


Subject to the constraints:
u ≥ 0
M u ≥ 1.



The first constraint says each element of the u vector must be nonnegative, and the second constraint says each element of the M u vector must be at least 1. For the resulting u vector, the inverse of the sum of its elements is the value of the game. Multiplying u by that value gives a probability vector, giving the probability that the maximizing player will choose each of the possible pure strategies.


If the game matrix does not have all positive elements, simply add a constant to every element that is large enough to make them all positive. That will increase the value of the game by that constant, and will have no effect on the equilibrium mixed strategies.
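The procedure above can be tried on a small game without any external solver. The sketch below is not a general-purpose linear programming routine: it swaps in brute-force vertex enumeration with exact fractions, which is only practical for tiny matrices. The payoff table, the shift constant of 30, and all function names are assumptions made for illustration; the table is chosen to agree with the payoffs quoted in the example.

```python
from fractions import Fraction
from itertools import combinations


def solve_linear(A, b):
    """Solve the square system A x = b exactly; return None if A is singular."""
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def solve_zero_sum(M):
    """Minimize sum(u) subject to u >= 0 and M u >= 1 by checking every vertex.

    Returns (value of the game, mixed strategy of the maximizing column player).
    Assumes every entry of M is positive.
    """
    cols = len(M[0])
    # Each constraint is (coefficients, rhs), read as coefficients . u >= rhs.
    constraints = [(list(row), 1) for row in M]
    constraints += [([int(k == j) for k in range(cols)], 0) for j in range(cols)]
    best = None
    for active in combinations(constraints, cols):
        u = solve_linear([c for c, _ in active], [rhs for _, rhs in active])
        if u is None:
            continue
        feasible = all(sum(c * x for c, x in zip(coeffs, u)) >= rhs
                       for coeffs, rhs in constraints)
        if feasible and (best is None or sum(u) < sum(best)):
            best = u
    value = 1 / sum(best)
    return value, [x * value for x in best]


# Assumed payoffs to Red (the maximizer): rows are Red's actions 1 and 2,
# columns are Blue's actions A, B, C.
red_payoffs = [[30, -10, 20], [-10, 20, -20]]
shift = 30  # assumed constant, large enough to make every entry positive
# M follows the convention above: the minimizer (Blue) picks the row,
# the maximizer (Red) picks the column.
M = [[red_payoffs[r][c] + shift for r in range(2)] for c in range(3)]

shifted_value, red_mix = solve_zero_sum(M)
game_value = shifted_value - shift  # undo the shift to recover the original value
```

For larger games a proper LP solver would be the natural choice; the enumeration here is only meant to make each step of the recipe (shift, solve, invert the sum, rescale) visible.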


The equilibrium mixed strategy for the minimizing player can be found by solving the dual of the given linear program. Alternatively, it can be found by applying the above procedure to a modified payoff matrix: the negated transpose of M, shifted by a constant so that every element is positive.
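For reference, the dual program can be written out explicitly. The dual variable w, with one component per row of M, is a label introduced here for illustration and is not notation from the source:

[math]\displaystyle{ \text{Maximize } \sum_{i} w_i \quad \text{subject to} \quad w \ge 0, \quad M^{\mathsf{T}} w \le 1 }[/math]

By linear programming duality the optimal [math]\sum_{i} w_i[/math] equals the optimal [math]\sum_{j} u_j[/math] of the original program, so its reciprocal is again the value of the game, and multiplying w by that value gives the minimizing player's probability vector.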



If all the solutions to the linear program are found, they will constitute all the Nash equilibria for the game. Conversely, any linear program can be converted into a two-player, zero-sum game by using a change of variables that puts it in the form of the above equations. So such games are equivalent to linear programs, in general.[citation needed]


Universal solution


If avoiding a zero-sum game is an action choice with some probability for players, avoiding is always an equilibrium strategy for at least one player in a zero-sum game. For any two-player zero-sum game where a zero-zero draw is impossible or non-credible after play has started, such as poker, there is no Nash equilibrium strategy other than avoiding the play. Even if there is a credible zero-zero draw after a zero-sum game has started, it is no better than the avoiding strategy. In this sense, it is interesting that reward-as-you-go in optimal choice computation prevails over all two-player zero-sum games with regard to whether or not to start the game.[4]







The most common or simple example from the subfield of social psychology is the concept of "social traps". In some cases pursuing individual personal interest can enhance the collective well-being of the group, but in other situations all parties pursuing personal interest results in mutually destructive behavior.


Complexity



Robert Wright theorizes in his book Nonzero: The Logic of Human Destiny that society becomes increasingly non-zero-sum as it becomes more complex, specialized, and interdependent.




Extensions


In 1944, John von Neumann and Oskar Morgenstern proved that any non-zero-sum game for n players is equivalent to a zero-sum game with n + 1 players; the (n + 1)th player representing the global profit or loss.[5]
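The payoff side of this construction is easy to illustrate. In the sketch below, the function name and the sample game are hypothetical; it appends one fictitious player whose payoff in each outcome cancels the total of the others, and it does not attempt to model that player's (trivial) strategy set.

```python
def add_balancing_player(outcomes):
    """Append a fictitious player whose payoff cancels the others' total."""
    return [tuple(payoffs) + (-sum(payoffs),) for payoffs in outcomes]


# Hypothetical two-player game: both gain, both lose, or only one gains.
trade_game = [(3, 2), (-1, -1), (4, -2)]
balanced = add_balancing_player(trade_game)
```

Every outcome of `balanced` sums to zero, with the third entry recording the global profit or loss that the original two players leave on the table.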


Misunderstandings

Zero-sum games and particularly their solutions are commonly misunderstood by critics of game theory, usually with respect to the independence and rationality of the players, as well as to the interpretation of utility functions. Furthermore, the word "game" does not imply the model is valid only for recreational games.[1]


Politics is sometimes called zero sum.[6][7][8]


Zero-sum thinking

In psychology, zero-sum thinking refers to the perception that a situation is like a zero-sum game, where one person's gain is another's loss.




References

  1. Ken Binmore (2007). Playing for real: a text on game theory. Oxford University Press US. ISBN 978-0-19-530057-4. https://books.google.com/?id=eY0YhSk9ujsC. Chapters 1 & 7.
  2. Bowles, Samuel (2004). Microeconomics: Behavior, Institutions, and Evolution. Princeton University Press. pp. 33–36. ISBN 0-691-09163-3. https://archive.org/details/microeconomicsbe00bowl.
  3. Wenliang Wang (2015). Pooling Game Theory and Public Pension Plan. Chapter 1 and Chapter 4.
  4. Wenliang Wang (2015). Pooling Game Theory and Public Pension Plan. Chapter 4.
  5. Theory of Games and Economic Behavior. Princeton University Press (1953). June 25, 2005. ISBN 9780691130613. https://press.princeton.edu/titles/7802.html. Retrieved 2018-02-25. 
  6. Rubin, Jennifer (2013-10-04). "The flaw in zero sum politics". The Washington Post. Retrieved 2017-03-08.
  7. "Lexington: Zero-sum politics". The Economist. 2014-02-08. Retrieved 2017-03-08.
  8. 模板:Cite dictionary


Further reading

  • Misstating the Concept of Zero-Sum Games within the Context of Professional Sports Trading Strategies, series Pardon the Interruption (2010-09-23) ESPN, created by Tony Kornheiser and Michael Wilbon, performance by Bill Simmons