Prisoner's Dilemma (PD)
The prisoner's dilemma is a canonical game in game theory that shows why two individuals might not cooperate even when cooperating appears to be in their best interest. It was originally framed by Merrill Flood and Melvin Dresher, working at RAND, in 1950. Albert W. Tucker formalized the game with prison-sentence payoffs and gave it the name "prisoner's dilemma" (Poundstone, 1992). A classic example of the prisoner's dilemma (PD) is as follows:
Two men are arrested, but the police do not have enough evidence for a conviction. After separating the two men, the police offer each the same deal: if one testifies against his partner (defects/betrays) and the other remains silent (cooperates/assists), the betrayer goes free and the one who remains silent receives the full one-year sentence. If both remain silent, both are sentenced to only one month in jail on a minor charge. If each 'rats out' the other, each receives a three-month sentence. Each prisoner must choose either to betray or to remain silent, and neither learns the other's choice before making his own. What should they do? If each player is assumed to care only about reducing his own time in jail, the game is a non-zero-sum game in which each player may either assist or betray the other. The interesting feature of this symmetric problem is that the logical decision leads each to betray the other, even though both would serve less time (one month instead of three) if they both stayed silent.
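The sentences in the story above can be written down as a small payoff table. The following is a minimal Python sketch; the labels "C" (stay silent) and "D" (testify), the name SENTENCES, and the helper total_months are illustrative choices, not part of the original formulation.

```python
# Jail time in months for each pair of choices, listed as (prisoner A, prisoner B).
# "C" = stay silent (cooperate), "D" = testify against the partner (defect).
SENTENCES = {
    ("C", "C"): (1, 1),    # both silent: one month each on the minor charge
    ("C", "D"): (12, 0),   # A silent, B testifies: A serves the full year
    ("D", "C"): (0, 12),   # A testifies, B silent: B serves the full year
    ("D", "D"): (3, 3),    # both testify: three months each
}


def total_months(a, b):
    """Combined jail time of both prisoners for one pair of choices."""
    return sum(SENTENCES[(a, b)])


for (a, b), (months_a, months_b) in SENTENCES.items():
    print(f"A={a} B={b}: A serves {months_a}, B serves {months_b}, "
          f"total {total_months(a, b)}")
```

The totals are not constant across the four cells, which is one way to see that the game is non-zero-sum: mutual silence costs the pair two months in total, while mutual betrayal costs six.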
In the regular (one-shot) version of this game, cooperation is strictly dominated by betrayal, and as a result the only equilibrium of the game is for both prisoners to betray each other. Regardless of what the other prisoner chooses, each prisoner always gets a better payoff by betraying: if the other stays silent, betraying means going free instead of serving one month, and if the other betrays, it means three months instead of a year. Because betrayal is always more beneficial than cooperation, purely rational, self-interested prisoners would both betray. In reality, however, humans display a systematic bias towards cooperative behavior in the prisoner's dilemma and similar games, cooperating much more often than a theory based only on rational self-interested action predicts.
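The dominance argument can be checked mechanically against the sentences from the story. This is a rough sketch rather than a general game solver; the names SENTENCES, strictly_dominates, and is_equilibrium are assumptions made for illustration, and payoffs are treated as jail months, so lower is better.

```python
# Jail time in months as (prisoner A, prisoner B); "C" = stay silent, "D" = betray.
SENTENCES = {
    ("C", "C"): (1, 1),
    ("C", "D"): (12, 0),
    ("D", "C"): (0, 12),
    ("D", "D"): (3, 3),
}
ACTIONS = ("C", "D")


def strictly_dominates(x, y):
    """True if action x gives prisoner A less jail time than y whatever B does."""
    return all(SENTENCES[(x, b)][0] < SENTENCES[(y, b)][0] for b in ACTIONS)


def is_equilibrium(a, b):
    """True if neither prisoner can cut his own sentence by switching alone."""
    a_ok = all(SENTENCES[(a, b)][0] <= SENTENCES[(alt, b)][0] for alt in ACTIONS)
    b_ok = all(SENTENCES[(a, b)][1] <= SENTENCES[(a, alt)][1] for alt in ACTIONS)
    return a_ok and b_ok


equilibria = [(a, b) for a in ACTIONS for b in ACTIONS if is_equilibrium(a, b)]
print(strictly_dominates("D", "C"))  # betrayal versus silence, from A's side
print(equilibria)
```

Because the game is symmetric, checking dominance from prisoner A's side is enough; the same comparison holds for prisoner B with the roles swapped.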
In the iterated version, the game is played over and over, so each prisoner repeatedly has an opportunity to penalize the other for the previous decision. If the number of rounds is known in advance, however, backward induction applies: in the final round there is no future in which to be punished, so both betray; given that, the second-to-last round cannot affect what follows, so both betray there as well, and so on back to the first round. The two prisoners therefore betray each other in every round.
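The backward-induction argument for a known number of rounds can be illustrated with a short sketch. It rests on a simplifying assumption that is stated here and in the code: play in later rounds is already fixed by the time an earlier round is analyzed, so the future jail time is the same constant in every cell of the current round. The function names and the five-round example are hypothetical.

```python
# Jail time in months as (prisoner A, prisoner B); "C" = stay silent, "D" = betray.
SENTENCES = {
    ("C", "C"): (1, 1),
    ("C", "D"): (12, 0),
    ("D", "C"): (0, 12),
    ("D", "D"): (3, 3),
}
ACTIONS = ("C", "D")


def stage_equilibria(future):
    """Pure equilibria of one round when `future` months are already locked in.

    Assumption: `future` is the (A, B) jail time from all later rounds and
    does not depend on what happens now. It is added to every cell, so it
    cannot change which action is better in this round.
    """
    def cost(player, a, b):
        return SENTENCES[(a, b)][player] + future[player]

    return [
        (a, b)
        for a in ACTIONS
        for b in ACTIONS
        if all(cost(0, a, b) <= cost(0, alt, b) for alt in ACTIONS)
        and all(cost(1, a, b) <= cost(1, a, alt) for alt in ACTIONS)
    ]


def backward_induction(rounds):
    """Solve the last round first, then step back towards the first round."""
    future = (0, 0)
    plan = []
    for _ in range(rounds):
        equilibria = stage_equilibria(future)
        if len(equilibria) != 1:
            raise ValueError("this sketch assumes a unique stage equilibrium")
        a, b = equilibria[0]
        plan.append((a, b))
        months_a, months_b = SENTENCES[(a, b)]
        future = (future[0] + months_a, future[1] + months_b)
    plan.reverse()  # report the rounds in playing order
    return plan, future


plan, totals = backward_induction(5)
print(plan)    # the pair of choices in each of the five rounds
print(totals)  # total months served by A and by B
```

The sketch only covers the case where the number of rounds is known in advance, which is the case the text describes.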
In casual usage, the label "prisoner's dilemma" may be applied to situations that do not strictly match the formal criteria of the classic or iterated games: for instance, those in which two entities could gain important benefits from cooperating or suffer from failing to do so, but find it merely difficult or expensive, not necessarily impossible, to coordinate their activities to achieve cooperation.