Zero-sum game

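From 集智百科 - Complex Systems | Artificial Intelligence | Complexity Science | Complex Networks | Self-organization
Revision as of 21:08, 25 October 2020 (Sunday) by Moonscar (talk | contribs) (Moved page from wikipedia:en:Zero-sum game (history))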

In game theory and economic theory, a zero-sum game is a mathematical representation of a situation in which each participant's gain or loss of utility is exactly balanced by the losses or gains of the utility of the other participants. If the total gains of the participants are added up and the total losses are subtracted, they will sum to zero. Thus, cutting a cake, where taking a larger piece reduces the amount of cake available for others as much as it increases the amount available for that taker, is a zero-sum game if all participants value each unit of cake equally (see marginal utility).

In contrast, non-zero-sum describes a situation in which the interacting parties' aggregate gains and losses can be less than or more than zero. A zero-sum game is also called a strictly competitive game while non-zero-sum games can be either competitive or non-competitive. Zero-sum games are most often solved with the minimax theorem which is closely related to linear programming duality,[1] or with Nash equilibrium.

Many people have a cognitive bias towards seeing situations as zero-sum, known as zero-sum bias.

Definition

Payoff matrix of the example game (each cell lists Red's payoff, then Blue's payoff)

Red \ Blue |    A     |    B     |    C
1          | 30, −30  | −10, 10  | 20, −20
2          | −10, 10  | 20, −20  | −20, 20

A game's payoff matrix is a convenient representation. Consider, for example, the two-player zero-sum game shown above.

The order of play proceeds as follows: The first player (red) chooses in secret one of the two actions 1 or 2; the second player (blue), unaware of the first player's choice, chooses in secret one of the three actions A, B or C. Then, the choices are revealed and each player's points total is affected according to the payoff for those choices.

Example: Red chooses action 2 and Blue chooses action B. When the payoff is allocated, Red gains 20 points and Blue loses 20 points.
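As a minimal sketch, the example game can be stored as a table of Red's payoffs, with Blue's payoff always the negative of Red's. The entries are reconstructed from the description of the example in this section; the entry for Red's action 2 against Blue's action A is not pinned down by the text and is assumed here to be −10. The names `RED_PAYOFF` and `payoffs` are illustrative only.

```python
# Red's payoff for each (Red action, Blue action); Blue receives the negative.
# Assumption: the (2, "A") entry is taken to be -10; the text does not fix it.
RED_PAYOFF = {
    (1, "A"): 30, (1, "B"): -10, (1, "C"): 20,
    (2, "A"): -10, (2, "B"): 20, (2, "C"): -20,
}

def payoffs(red_action, blue_action):
    """Return (Red's points, Blue's points) for one round."""
    red = RED_PAYOFF[(red_action, blue_action)]
    return red, -red

# Every outcome sums to zero, which is what makes the game zero-sum.
assert all(sum(payoffs(r, b)) == 0 for (r, b) in RED_PAYOFF)

print(payoffs(2, "B"))  # (20, -20): Red gains 20, Blue loses 20
```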

In this example game, both players know the payoff matrix and attempt to maximize the number of their points. Red could reason as follows: "With action 2, I could lose up to 20 points and can win only 20, and with action 1 I can lose only 10 but can win up to 30, so action 1 looks a lot better." With similar reasoning, Blue would choose action C. If both players take these actions, Red will win 20 points. If Blue anticipates Red's reasoning and choice of action 1, Blue may choose action B, so as to win 10 points. If Red, in turn, anticipates this trick and goes for action 2, this wins Red 20 points.
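The chain of second-guessing above can be reproduced mechanically. The sketch below is one possible reading of the players' reasoning: each player compares actions by worst case first and best case second, and a pure-strategy equilibrium would be a pair of mutual best responses. The payoff values are reconstructed from the text, with the entry for action 2 against A assumed to be −10; all function names are illustrative.

```python
# Red's payoffs (rows: Red's actions 1, 2; columns: Blue's actions A, B, C).
# Assumption: the (2, A) entry is -10; the text does not determine it.
RED = {1: {"A": 30, "B": -10, "C": 20},
       2: {"A": -10, "B": 20, "C": -20}}
BLUE_ACTIONS = ["A", "B", "C"]

def red_range(r):
    """(worst, best) outcome for Red when Red plays r."""
    return min(RED[r].values()), max(RED[r].values())

def blue_range(b):
    """(worst, best) outcome for Blue when Blue plays b."""
    outcomes = [-RED[r][b] for r in RED]
    return min(outcomes), max(outcomes)

# Cautious choices: compare worst cases first, then best cases.
red_cautious = max(RED, key=red_range)
blue_cautious = max(BLUE_ACTIONS, key=blue_range)

def blue_best_response(r):
    """Blue's action that leaves Red with the least when Red plays r."""
    return min(BLUE_ACTIONS, key=lambda b: RED[r][b])

def red_best_response(b):
    """Red's action that earns the most when Blue plays b."""
    return max(RED, key=lambda r: RED[r][b])

# A pure-strategy equilibrium is a pair of mutual best responses.
pure_equilibria = [(r, b) for r in RED for b in BLUE_ACTIONS
                   if red_best_response(b) == r and blue_best_response(r) == b]

print(red_cautious, blue_cautious)                    # 1 C
print(blue_best_response(1), red_best_response("B"))  # B 2
print(pure_equilibria)                                # []
```

The empty list shows that no pair of fixed actions is stable, which is why the anticipation described above never settles.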

Émile Borel and John von Neumann had the fundamental insight that probability provides a way out of this conundrum. Instead of deciding on a definite action, each player assigns probabilities to their actions and then uses a random device that chooses an action according to those probabilities. Each player computes the probabilities so as to minimize the maximum expected point loss, whatever strategy the opponent uses. This leads to a linear programming problem whose solution gives the optimal strategies for each player. This minimax method can compute provably optimal strategies for all two-player zero-sum games.

For the example given above, it turns out that Red should choose action 1 with probability 4/7 and action 2 with probability 3/7, and Blue should assign the probabilities 0, 4/7, and 3/7 to the three actions A, B, and C. Red will then win 20/7 points on average per game.
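A candidate solution of this kind can be checked with exact arithmetic. The sketch below assumes the mixed strategies 4/7 and 3/7 for Red and 0, 4/7, 3/7 for Blue, which follow from making each player indifferent between the opponent's actions B and C (respectively 1 and 2). The matrix is reconstructed from the text, with the entry for action 2 against A assumed to be −10; that entry does not affect the result as long as Blue never plays A.

```python
from fractions import Fraction as F

# Red's payoffs; rows are Red's actions 1, 2 and columns are Blue's A, B, C.
# Assumption: the (2, A) entry is -10; the text does not determine it.
M = [[30, -10, 20],
     [-10, 20, -20]]

red_mix = [F(4, 7), F(3, 7)]
blue_mix = [F(0), F(4, 7), F(3, 7)]

# Red's expected payoff against each pure action of Blue.
vs_blue = [sum(red_mix[r] * M[r][c] for r in range(2)) for c in range(3)]
# Red's expected payoff from each pure action of Red against Blue's mix.
vs_red = [sum(blue_mix[c] * M[r][c] for c in range(3)) for r in range(2)]

print(vs_blue)  # Red is guaranteed at least min(vs_blue)
print(vs_red)   # Blue concedes at most max(vs_red)
assert min(vs_blue) == max(vs_red) == F(20, 7)
```

Because the amount Red can guarantee equals the amount Blue can hold Red to, neither player can improve by deviating, and 20/7 is the value of the game.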

Solving

The Nash equilibrium for a two-player, zero-sum game can be found by solving a linear programming problem. Suppose a zero-sum game has a payoff matrix M where element M_{i,j} is the payoff obtained when the minimizing player chooses pure strategy i and the maximizing player chooses pure strategy j (i.e. the player trying to minimize the payoff chooses the row and the player trying to maximize the payoff chooses the column). Assume every element of M is positive. The game will have at least one Nash equilibrium. The Nash equilibrium can be found (Raghavan 1994, p. 740) by solving the following linear program to find a vector u:

Minimize:

[math]\displaystyle{ \sum_{i} u_i }[/math]

Subject to the constraints:

[math]\displaystyle{ u \ge 0 }[/math]

[math]\displaystyle{ M u \ge 1 }[/math]

The first constraint says each element of the u vector must be nonnegative, and the second constraint says each element of the M u vector must be at least 1. For the resulting u vector, the inverse of the sum of its elements is the value of the game. Multiplying u by that value gives a probability vector, giving the probability that the maximizing player will choose each of the possible pure strategies.
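The procedure can be sketched in a few lines of Python. This is a minimal illustration rather than a practical method: it solves the linear program by brute force, testing every vertex of the feasible region with exact fractions, which is workable only for tiny games (in practice a dedicated solver such as `scipy.optimize.linprog` would be used). The helper names and the constant `SHIFT` are hypothetical. The matrix is the example game arranged in this section's convention, shifted by 21 to satisfy the positivity assumption, with the entry for Blue's A against Red's 2 assumed to be −10.

```python
from fractions import Fraction
from itertools import combinations

def solve_linear_system(A, b):
    """Solve the square system A x = b exactly; return None if singular."""
    n = len(A)
    rows = [[Fraction(v) for v in A[i]] + [Fraction(b[i])] for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * p for a, p in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def solve_game_lp(M):
    """Minimize sum(u) subject to u >= 0 and M u >= 1 by checking vertices."""
    n = len(M[0])
    # Every constraint is stored as (coefficients, right-hand side) with ">=".
    constraints = [(list(row), 1) for row in M]
    constraints += [([int(i == j) for j in range(n)], 0) for i in range(n)]
    best = None
    for active in combinations(constraints, n):
        u = solve_linear_system([a for a, _ in active], [rhs for _, rhs in active])
        if u is None:
            continue
        feasible = all(sum(a * x for a, x in zip(coef, u)) >= rhs
                       for coef, rhs in constraints)
        if feasible and (best is None or sum(u) < sum(best)):
            best = u
    return best

# Rows belong to the minimizing player (Blue: A, B, C) and columns to the
# maximizing player (Red: 1, 2); entries are Red's payoffs.
# Assumption: the (A, 2) entry is -10; the text does not determine it.
RED_PAYOFF = [[30, -10], [-10, 20], [20, -20]]
SHIFT = 21  # hypothetical constant, chosen only to make every entry positive
M = [[v + SHIFT for v in row] for row in RED_PAYOFF]

u = solve_game_lp(M)
shifted_value = 1 / sum(u)                  # value of the shifted game
strategy = [x * shifted_value for x in u]   # maximizing player's probabilities
value = shifted_value - SHIFT               # value of the original game
print(strategy, value)
```

The resulting probabilities belong to the maximizing player (Red in the example), matching the normalization step described above.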

If the game matrix does not have all positive elements, add to every element a constant large enough to make them all positive. That will increase the value of the game by that constant and will have no effect on the equilibrium mixed strategies.
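Why the shift is harmless can be seen in one line. As a sketch, let x and y be probability vectors for the maximizing and minimizing player, c the added constant, and J the all-ones matrix of the same shape as M (x, y, c and J are symbols introduced here for illustration):

```latex
y^{\mathsf T}(M + cJ)\,x
  = y^{\mathsf T} M x + c\,(y^{\mathsf T}\mathbf{1})(\mathbf{1}^{\mathsf T}x)
  = y^{\mathsf T} M x + c
```

Every pair of mixed strategies has its expected payoff raised by exactly c, so the comparison between strategies is unchanged and only the value of the game moves.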

The equilibrium mixed strategy for the minimizing player can be found by solving the dual of the given linear program. Alternatively, it can be found by applying the above procedure to a modified payoff matrix that is the transpose and negation of M (with a constant added so that it is positive), and then solving the resulting game.
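For reference, the dual of the linear program can be written out explicitly. In this sketch w is a vector of dual variables (a symbol introduced here), with one entry per pure strategy of the minimizing player:

```latex
\begin{aligned}
\text{maximize}\quad   & \sum_{i} w_i \\
\text{subject to}\quad & w \ge 0, \\
                       & M^{\mathsf T} w \le 1
\end{aligned}
```

By strong duality the optimal objective equals that of the original program, so multiplying w by the value of the game gives the minimizing player's probability vector.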

If all the solutions to the linear program are found, they will constitute all the Nash equilibria for the game. Conversely, any linear program can be converted into a two-player, zero-sum game by using a change of variables that puts it in the form of the above equations. So such games are equivalent to linear programs, in general.[citation needed]

Universal solution

If declining to play is available to the players as an action with some probability, then avoidance is always an equilibrium strategy for at least one player in a zero-sum game. For any two-player zero-sum game in which a zero-zero draw is impossible or non-credible once play has started, such as poker, there is no Nash equilibrium strategy other than avoiding the game. Even if a credible zero-zero draw exists after a zero-sum game has started, it is no better than the avoidance strategy. In this sense, the reward-as-you-go approach to computing the optimal choice prevails across all two-player zero-sum games with regard to whether to start the game.[2]

In social psychology, the most common and simplest example of a non-zero-sum situation is the concept of "social traps". In some cases pursuing individual personal interest can enhance the collective well-being of the group, but in other situations all parties pursuing personal interest results in mutually destructive behavior.

Complexity

Robert Wright theorized in his book Nonzero: The Logic of Human Destiny that society becomes increasingly non-zero-sum as it becomes more complex, specialized, and interdependent.

Extensions

In 1944, John von Neumann and Oskar Morgenstern proved that any non-zero-sum game for n players is equivalent to a zero-sum game with n + 1 players; the (n + 1)th player representing the global profit or loss.[3]
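The construction can be illustrated with a small sketch: a passive extra player is appended whose payoff in every outcome is the negative of the others' total, so that each outcome sums to zero. The two-player game and its payoffs below are made up for illustration, and `add_balancing_player` is a hypothetical helper name.

```python
# A two-player non-zero-sum game with assumed, illustrative payoffs;
# each outcome maps to (player 1 payoff, player 2 payoff).
GAME = {
    ("x", "x"): (-1, -1),
    ("x", "y"): (-3, 0),
    ("y", "x"): (0, -3),
    ("y", "y"): (-2, -2),
}

def add_balancing_player(game):
    """Append a passive player whose payoff is minus the sum of the others."""
    return {outcome: pay + (-sum(pay),) for outcome, pay in game.items()}

ZERO_SUM = add_balancing_player(GAME)

# With the extra player included, every outcome now sums to zero.
assert all(sum(pay) == 0 for pay in ZERO_SUM.values())
print(ZERO_SUM[("y", "y")])  # (-2, -2, 4)
```

The extra player has no decisions to make; it only absorbs the global profit or loss.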

Misunderstandings

Zero-sum games and particularly their solutions are commonly misunderstood by critics of game theory, usually with respect to the independence and rationality of the players, as well as to the interpretation of utility functions. Furthermore, the word "game" does not imply the model is valid only for recreational games.[1]

Politics is sometimes called zero sum.[4][5][6]

Zero-sum thinking

In psychology, zero-sum thinking refers to the perception that a situation is like a zero-sum game, where one person's gain is another's loss.

References

  1. Ken Binmore (2007). Playing for real: a text on game theory. Oxford University Press US. ISBN 978-0-19-530057-4. https://books.google.com/?id=eY0YhSk9ujsC. Chapters 1 & 7.
  2. Wenliang Wang (2015). Pooling Game Theory and Public Pension Plan. Chapter 4.
  3. Theory of Games and Economic Behavior. Princeton University Press (1953). June 25, 2005. ISBN 9780691130613. https://press.princeton.edu/titles/7802.html. Retrieved 2018-02-25.
  4. Rubin, Jennifer (2013-10-04). "The flaw in zero sum politics". The Washington Post. Retrieved 2017-03-08.
  5. "Lexington: Zero-sum politics". The Economist. 2014-02-08. Retrieved 2017-03-08.
  6. Template:Cite dictionary


Further reading

  • Misstating the Concept of Zero-Sum Games within the Context of Professional Sports Trading Strategies, series Pardon the Interruption (2010-09-23) ESPN, created by Tony Kornheiser and Michael Wilbon, performance by Bill Simmons