Zero-sum game

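From 集智百科 - Complex Systems | Artificial Intelligence | Complexity Science | Complex Networks | Self-Organization
Revision by Moonscar (talk | contribs) as of 17:14, 12 May 2020 (Tuesday) (Moved page from wikipedia:en:Zero-sum game (history))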

In game theory and economic theory, a zero-sum game is a mathematical representation of a situation in which each participant's gain or loss of utility is exactly balanced by the losses or gains of the utility of the other participants. If the total gains of the participants are added up and the total losses are subtracted, they will sum to zero. Thus, cutting a cake, where taking a larger piece reduces the amount of cake available for others as much as it increases the amount available for that taker, is a zero-sum game if all participants value each unit of cake equally (see marginal utility).
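As a minimal sketch of this definition, the check below treats a game as a table from outcomes to per-player payoffs and tests whether every outcome sums to zero. The function name `is_zero_sum` and all the numbers are illustrative assumptions, not taken from the article.

```python
from fractions import Fraction

def is_zero_sum(outcomes):
    """Return True if the players' payoffs sum to zero in every outcome."""
    return all(sum(payoffs) == 0 for payoffs in outcomes.values())

# Hypothetical cake split between two players who value each unit equally.
# Payoffs are changes relative to an even split of the cake.
cake = {
    "even split": (Fraction(0), Fraction(0)),
    "first takes 3/4": (Fraction(1, 4), Fraction(-1, 4)),
    "second takes 2/3": (Fraction(-1, 6), Fraction(1, 6)),
}

# Hypothetical trade in which both sides gain, so it is not zero-sum.
trade = {"trade": (2, 3), "no trade": (0, 0)}

print(is_zero_sum(cake))   # True
print(is_zero_sum(trade))  # False
```

Exact fractions are used so that the equality test is not disturbed by floating-point rounding.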


In contrast, non-zero-sum describes a situation in which the interacting parties' aggregate gains and losses can be less than or more than zero. A zero-sum game is also called a strictly competitive game while non-zero-sum games can be either competitive or non-competitive. Zero-sum games are most often solved with the minimax theorem which is closely related to linear programming duality,[1] or with Nash equilibrium.


Humans have a cognitive bias towards seeing situations as zero-sum, known as zero-sum bias.


Definition

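[Payoff matrix: generic two-player zero-sum game. Only the second row survived extraction: under Choice 2 the payoff pairs are (c, −c) in the left cell and (−d, d) in the right cell.]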


The zero-sum property (if one gains, another loses) means that any result of a zero-sum situation is Pareto optimal. Generally, any game where all strategies are Pareto optimal is called a conflict game.[2]


Zero-sum games are a specific example of constant sum games where the sum of each outcome is always zero. Such games are distributive, not integrative; the pie cannot be enlarged by good negotiation.


Situations where participants can all gain or suffer together are referred to as non-zero-sum. Thus, a country with an excess of bananas trading with another country for their excess of apples, where both benefit from the transaction, is in a non-zero-sum situation. Other non-zero-sum games are games in which the sum of gains and losses by the players is sometimes more or less than what they began with.


The idea of Pareto optimal payoff in a zero-sum game gives rise to a generalized relative selfish rationality standard, the punishing-the-opponent standard, where both players always seek to minimize the opponent's payoff at a favorable cost to themselves rather than to prefer more over less. The punishing-the-opponent standard can be used in both zero-sum games (e.g. warfare game, chess) and non-zero-sum games (e.g. pooling selection games).[3]


Solution

For two-player finite zero-sum games, the different game theoretic solution concepts of Nash equilibrium, minimax, and maximin all give the same solution. If the players are allowed to play a mixed strategy, the game always has an equilibrium.
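For reference, the agreement of the maximin and minimax solutions in mixed strategies can be written as a single equation. The symbols are assumptions introduced here for illustration: A is the payoff matrix of the maximizing player, x and y range over the mixed strategies (probability vectors) of the maximizing and minimizing player, and v is the value of the game.

[math]\displaystyle{ \max_{x} \min_{y} \; x^{\mathsf T} A y \;=\; \min_{y} \max_{x} \; x^{\mathsf T} A y \;=\; v }[/math]

A pair (x, y) attaining both sides is a Nash equilibrium, which is why the three solution concepts coincide for these games.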


Example

{| class="wikitable" style="float: right; margin-left: 1em;"
|+ A zero-sum game (each cell: Red's payoff, Blue's payoff)
! Red \ Blue !! A !! B !! C
|-
! 1
| 30, −30 || −10, 10 || 20, −20
|-
! 2
| (not recoverable) || 20, −20 || −20, 20
|}


A game's payoff matrix is a convenient representation. Consider for example the two-player zero-sum game pictured at right or above.


The order of play proceeds as follows: The first player (red) chooses in secret one of the two actions 1 or 2; the second player (blue), unaware of the first player's choice, chooses in secret one of the three actions A, B or C. Then, the choices are revealed and each player's points total is affected according to the payoff for those choices.


Example: Red chooses action 2 and Blue chooses action B. When the payoff is allocated, Red gains 20 points and Blue loses 20 points.


In this example game, both players know the payoff matrix and attempt to maximize the number of their points. Red could reason as follows: "With action 2, I could lose up to 20 points and can win only 20, and with action 1 I can lose only 10 but can win up to 30, so action 1 looks a lot better." With similar reasoning, Blue would choose action C. If both players take these actions, Red will win 20 points. If Blue anticipates Red's reasoning and choice of action 1, Blue may choose action B, so as to win 10 points. If Red, in turn, anticipates this trick and goes for action 2, this wins Red 20 points.
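The circular second-guessing described above reflects the absence of a stable pure-strategy choice, which can be checked with a short sketch. The matrix below is an assumption reconstructed from the narrative: the entries for (1, A), (1, B), (1, C) and (2, B) follow from the text, while the entries for (2, A) and (2, C) are placeholders chosen to be consistent with Red's stated reasoning. The result does not change for any values consistent with that reasoning.

```python
# Red's payoffs (Blue's payoffs are the negatives).
# Rows: Red's actions 1 and 2; columns: Blue's actions A, B, C.
# ASSUMPTION: the entries for (2, A) and (2, C) are illustrative placeholders.
RED_PAYOFF = [
    [30, -10, 20],
    [10, 20, -20],
]

# Red's security level: the best of the worst cases for each row.
row_minima = [min(row) for row in RED_PAYOFF]
maximin = max(row_minima)

# Blue's security level: the smallest of the column maxima
# (Blue wants Red's payoff to be as low as possible).
col_maxima = [max(col) for col in zip(*RED_PAYOFF)]
minimax = min(col_maxima)

# A pure-strategy saddle point is a cell that is both a row minimum
# and a column maximum.
saddle_points = [
    (i + 1, "ABC"[j])
    for i, row in enumerate(RED_PAYOFF)
    for j, value in enumerate(row)
    if value == row_minima[i] and value == col_maxima[j]
]

print(maximin, minimax, saddle_points)  # -10 20 []
```

Because the two security levels differ and no saddle point exists, every pure choice can be exploited by an opponent who anticipates it.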


Émile Borel and John von Neumann had the fundamental insight that probability provides a way out of this conundrum. Instead of deciding on a definite action to take, the two players assign probabilities to their respective actions, and then use a random device which, according to these probabilities, chooses an action for them. Each player computes the probabilities so as to minimize the maximum expected point-loss independent of the opponent's strategy. This leads to a linear programming problem whose solution gives the optimal strategies for each player. This minimax method can compute provably optimal strategies for all two-player zero-sum games.


For the example given above, it turns out that Red should choose action 1 with probability 4/7 and action 2 with probability 3/7, and Blue should assign the probabilities 0, 4/7, and 3/7 to the three actions A, B, and C. Red will then win 20/7 points on average per game.
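A mixed-strategy equilibrium of this kind can be checked by computing expected payoffs with exact fractions. The sketch below assumes Red plays (4/7, 3/7) and Blue plays (0, 4/7, 3/7), and uses a payoff matrix reconstructed from the narrative. The entry for (2, A) is an assumption, and any value between −20 and 20 leaves the conclusion unchanged.

```python
from fractions import Fraction as F

# Red's payoffs; rows are Red's actions 1 and 2, columns are Blue's A, B, C.
# ASSUMPTION: the entry for (2, A) is a placeholder, not taken from the source.
RED_PAYOFF = [
    [30, -10, 20],
    [10, 20, -20],
]

red_mix = [F(4, 7), F(3, 7)]
blue_mix = [F(0), F(4, 7), F(3, 7)]

# Red's expected payoff against each pure action of Blue.
vs_blue_actions = [
    sum(red_mix[i] * RED_PAYOFF[i][j] for i in range(2)) for j in range(3)
]

# Red's expected payoff for each pure action against Blue's mix.
vs_red_actions = [
    sum(blue_mix[j] * RED_PAYOFF[i][j] for j in range(3)) for i in range(2)
]

print([str(x) for x in vs_blue_actions])  # ['150/7', '20/7', '20/7']
print([str(x) for x in vs_red_actions])   # ['20/7', '20/7']
```

Red's mix guarantees at least 20/7 whatever Blue does, and Blue's mix holds Red to at most 20/7 whatever Red does, so neither player can gain by deviating. Action A is strictly worse for Blue, which is why it receives probability 0.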


Solving

The Nash equilibrium for a two-player, zero-sum game can be found by solving a linear programming problem. Suppose a zero-sum game has a payoff matrix M where element [math]M_{i,j}[/math] is the payoff obtained when the minimizing player chooses pure strategy i and the maximizing player chooses pure strategy j (i.e. the player trying to minimize the payoff chooses the row and the player trying to maximize the payoff chooses the column). Assume every element of M is positive. The game will have at least one Nash equilibrium. The Nash equilibrium can be found (Raghavan 1994, p. 740) by solving the following linear program to find a vector u:


Minimize:

[math]\displaystyle{ \sum_{i} u_i }[/math]

Subject to the constraints:

u ≥ 0
M u ≥ 1.


The first constraint says each element of the u vector must be nonnegative, and the second constraint says each element of the M u vector must be at least 1. For the resulting u vector, the reciprocal of the sum of its elements is the value of the game. Multiplying u by that value gives a probability vector, giving the probability that the maximizing player will choose each of the possible pure strategies.


If the game matrix does not have all positive elements, simply add a constant to every element that is large enough to make them all positive. That will increase the value of the game by that constant, and will have no effect on the equilibrium mixed strategies.
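The procedure can be sketched for the special case in which the maximizing player has only two pure strategies, so that u has two components and the linear program can be solved by enumerating vertices of the feasible region. This is a brute-force illustration, not a general linear programming solver, and the function name `solve_two_column_game` is hypothetical. The matrix is the example game transposed to the convention used here (rows for the minimizing player Blue, columns for the maximizing player Red), with one entry assumed.

```python
from fractions import Fraction
from itertools import combinations

def solve_two_column_game(M):
    """Solve: minimize u1 + u2 subject to M'u >= 1, u >= 0 (M' is M shifted).

    M[i][j] is the payoff when the minimizer plays row i and the maximizer
    plays column j. Returns (value of the game, maximizer's mixed strategy).
    """
    # Add a constant so every entry is at least 1, hence positive.
    shift = 1 - min(min(row) for row in M)
    P = [[Fraction(x + shift) for x in row] for row in M]

    # Boundary lines a*u1 + b*u2 = c: one per row of P, plus the two axes.
    lines = [(row[0], row[1], Fraction(1)) for row in P]
    lines += [(Fraction(1), Fraction(0), Fraction(0)),
              (Fraction(0), Fraction(1), Fraction(0))]

    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(lines, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines define no vertex
        u1 = (c1 * b2 - c2 * b1) / det  # Cramer's rule
        u2 = (a1 * c2 - a2 * c1) / det
        feasible = (u1 >= 0 and u2 >= 0
                    and all(r[0] * u1 + r[1] * u2 >= 1 for r in P))
        if feasible and (best is None or u1 + u2 < sum(best)):
            best = (u1, u2)

    shifted_value = 1 / sum(best)           # reciprocal of the sum of u
    strategy = tuple(u * shifted_value for u in best)
    return shifted_value - shift, strategy  # undo the shift

# Rows: Blue's actions A, B, C; columns: Red's actions 1, 2.
# ASSUMPTION: the entry for (A, 2) is a placeholder, not taken from the source.
M = [[30, 10], [-10, 20], [20, -20]]
value, red_mix = solve_two_column_game(M)
print(value, [str(p) for p in red_mix])  # 20/7 ['4/7', '3/7']
```

For larger games a proper linear programming routine would be the natural choice; the enumeration above is only meant to make each step of the recipe visible.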


The equilibrium mixed strategy for the minimizing player can be found by solving the dual of the given linear program. Or, it can be found by using the above procedure to solve a modified payoff matrix which is the transpose and negation of M (adding a constant so it's positive), then solving the resulting game.
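For reference, the dual of the linear program can be written out as follows. The vector w is a symbol introduced here for illustration; it has one component for each pure strategy of the minimizing player (each row of M).

[math]\displaystyle{ \text{maximize } \sum_{i} w_i \quad \text{subject to} \quad w \ge 0, \quad M^{\mathsf T} w \le 1 }[/math]

By linear programming duality both programs have the same optimal objective, the reciprocal of the value of the game, so multiplying the optimal w by that value gives the minimizing player's equilibrium mixed strategy.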


If all the solutions to the linear program are found, they will constitute all the Nash equilibria for the game. Conversely, any linear program can be converted into a two-player, zero-sum game by using a change of variables that puts it in the form of the above equations. So such games are equivalent to linear programs, in general.[citation needed]


Universal solution

If avoiding a zero-sum game is an action that players can choose with some probability, avoiding is always an equilibrium strategy for at least one player in a zero-sum game. For any two-player zero-sum game in which a zero-zero draw is impossible or non-credible once play has started, such as poker, there is no Nash equilibrium strategy other than avoiding the play. Even if a credible zero-zero draw exists after a zero-sum game has started, it is no better than the avoiding strategy. In this sense, a reward-as-you-go approach to optimal choice computation prevails over all two-player zero-sum games on the question of whether to start the game at all.[4]


The most common or simple example from the subfield of social psychology is the concept of "social traps". In some cases pursuing individual personal interest can enhance the collective well-being of the group, but in other situations all parties pursuing personal interest results in mutually destructive behavior.


Complexity

Robert Wright theorized in his book Nonzero: The Logic of Human Destiny that society becomes increasingly non-zero-sum as it becomes more complex, specialized, and interdependent.


Movies

See the plot of Arrival


Extensions

In 1944, John von Neumann and Oskar Morgenstern proved that any non-zero-sum game for n players is equivalent to a zero-sum game with n + 1 players; the (n + 1)th player representing the global profit or loss.[5]
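The construction can be illustrated with a short sketch that appends a fictitious player whose payoff is the negative of the others' total, so that every outcome sums to zero. The function name `add_balancing_player` and the trade payoffs are hypothetical and serve only as an example.

```python
def add_balancing_player(outcomes):
    """Append a fictitious player who absorbs the global profit or loss."""
    return {
        outcome: tuple(payoffs) + (-sum(payoffs),)
        for outcome, payoffs in outcomes.items()
    }

# Hypothetical two-player non-zero-sum game: both sides gain by trading.
trade = {
    ("trade", "trade"): (2, 3),
    ("trade", "refuse"): (0, 0),
    ("refuse", "trade"): (0, 0),
    ("refuse", "refuse"): (0, 0),
}

extended = add_balancing_player(trade)
print(extended[("trade", "trade")])                     # (2, 3, -5)
print(all(sum(p) == 0 for p in extended.values()))      # True
```

The added player has no choices of its own; it only records the total that the real players jointly gain or lose.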


Misunderstandings

Zero-sum games and particularly their solutions are commonly misunderstood by critics of game theory, usually with respect to the independence and rationality of the players, as well as to the interpretation of utility functions. Furthermore, the word "game" does not imply the model is valid only for recreational games.[1]


Politics is sometimes called zero sum.[6][7][8]


Zero-sum thinking

In psychology, zero-sum thinking refers to the perception that a situation is like a zero-sum game, where one person's gain is another's loss.



References

  1. Ken Binmore (2007). Playing for real: a text on game theory. Oxford University Press US. ISBN 978-0-19-530057-4. https://books.google.com/?id=eY0YhSk9ujsC. Chapters 1 & 7.
  2. Bowles, Samuel (2004). Microeconomics: Behavior, Institutions, and Evolution. Princeton University Press. pp. 33–36. ISBN 0-691-09163-3.
  3. Wenliang Wang (2015). Pooling Game Theory and Public Pension Plan. Chapter 1 and Chapter 4.
  4. Wenliang Wang (2015). Pooling Game Theory and Public Pension Plan. Chapter 4.
  5. Theory of Games and Economic Behavior. Princeton University Press (1953). June 25, 2005. ISBN 9780691130613. https://press.princeton.edu/titles/7802.html. Retrieved 2018-02-25.
  6. Rubin, Jennifer (2013-10-04). "The flaw in zero sum politics". The Washington Post. Retrieved 2017-03-08.
  7. "Lexington: Zero-sum politics". The Economist. 2014-02-08. Retrieved 2017-03-08.
  8. [Dictionary citation; details not preserved in the source.]


Further reading

  • Misstating the Concept of Zero-Sum Games within the Context of Professional Sports Trading Strategies, series Pardon the Interruption (2010-09-23) ESPN, created by Tony Kornheiser and Michael Wilbon, performance by Bill Simmons