by Mark R. Waser (originally appeared Dec. 10, 2012 at Transhumanity.Net)
You’re appearing on the “hottest new game show” Money to Burn. You’ll be playing two rounds of a game theory classic against the host with a typical “Money to Burn” twist. If you can earn more than $100, you get to move on to the next round.
One of the current problems in (and arguably with) game theory is the so-called “centipede game” (Rosenthal 1981) – where the so-called logical/rational/correct/”optimal” strategy produces virtually the worst possible result. While there are many variants of the centipede game, they all take the form of players alternately choosing between increasing the pot and continuing the game –or– “cashing out” by taking the majority of the pot and ending the game. The “problem” posed by the game is that the current player has to choose between “selfishly” taking a sure thing or trusting the other player to be unselfish – in a game where a finite ending guarantees that one of the two will end up with less than they could have gotten if they’d just ended the game a turn earlier.
In the most common variant, the game starts with two piles of money lying on the table – one pile with four one-dollar bills and a second “pile” with a single one-dollar bill. The first player has two options, to stop or to pass. If he stops, the game ends and he gets the larger pile ($4) while the second player gets the remaining pile ($1). If he continues, the two piles are doubled to $8 and $2, and the second player is faced with a similar decision with twice the stakes: either to take the $8 pile, thus ending the game and leaving the $2 pile for the first player, or to let the piles double again and let the first player choose again. This continues for at most six choices. If neither of the players has stopped by then, the first player gets $256 and the second player gets $64. This is diagrammed below.
You are going to play this variant twice (once going second and once going first) against the game show host – who, in standard Money To Burn fashion, is going to literally burn any of the money that he wins. To make it a true logic puzzle (rather than a guessing game), you must first write down your strategy for both of the games and then the host is required to make the decisions that result in his burning the greatest amount of money (in much the same way that a blackjack dealer must “hit” on 16 or below and “stay” on 17 or above).
While this game offers both players a very profitable opportunity ($256/$64 or $32/$128), backward induction (one of the oldest ideas in game theory) predicts/dictates that the first player will/should stop at the first opportunity and win just $4. The reasoning behind this can be described as follows. On the last turn, player two has a choice between stopping and getting $128 and passing and getting only $64. “Obviously”, s/he will choose to stop. If this is true then, on the penultimate turn, player one really just has the choice between stopping for $64 or passing and getting only $32. Of course, this then means that, on the turn before, the second player just has the choice between $32 and $16 . . . and so on . . . until, on the first turn, the first player has a choice between $4 and $2 (and chooses $4).
The large discrepancy between the profits that intuition says are possible and what the “rationality” of backward induction dictated induced Rosenthal (in the very same paper in which he introduced the centipede game) to propose an alternative to the game theoretic approach in the hope of obtaining predictions more in line with intuition. And, indeed, one can certainly sympathize/agree with those individuals reported (Aumann 1992) to find the backward induction outcomes so disturbing that “if this is rationality, they want none of it” (pg. 218). So, it should come as no surprise that numerous subsequent papers were quickly able to demonstrate examples where violating backward induction is perfectly rational (due to the believed probability that the other player was unlikely to adhere strictly to the dictates of backward induction).
And, indeed, with the twist in the game show variant (that the game show host knows what your strategy is and *must* act to ensure that he burns the most money possible), a strategy of always passing will net you $32 + $64 = $96 (much higher – but still $4 short!). If you were player 1 first, you could make $256 + $64 = $320 with a strategy where you will only pass on your last choice as player 2 if the game show host passed on his last choice as player 2. But, of course, you start as player 2 and know that the host must break any promises that he makes which result in his burning less money.
But, what happens when players are perfectly rational and/or believe that the other player is likely to adhere strictly to the dictates of backward induction? Palacios-Huerta and Volij (2009) performed an experiment comparing when chess players played chess players, students played students, and chess players and students played each other. When chess players played chess players in one-shot games, 69% of the games ended with the first node. This percentage escalated to 100% when the first player was a Grand Master. Further, when chess players played ten repetitions, every game after the fourth repetition ended with the first node.
This is in marked contrast to the case of students playing students – where only 3% of the games ended at the initial node and there was no change to this behavior with repetition. Yet, in the treatments where students faced chess players as player 1, ten times as many of the games (30%) ended at the initial node and this percentage rose to 70% with repetition. And, when chess players faced students as the first player, they ended the games at the first node only 37.5% of the time (over 45% less than when they played other chess players). Thus, student players playing each other make the most money, followed by mixed competitions, with chess players playing each other making the least.
Aumann (1995) proves that it is “common knowledge of rationality” (rather than “rationality” itself) that causes the backward induction outcome to be reached. Yet, I would vigorously dispute that it is “rationality” which was the common knowledge since it causes the most “rational” to have the worst results (earn the least money). Indeed, it is at this point, I contend, that backward induction should have been relegated to the “correctness” category of “self-fulfilling prophecy” rather than the category of “game-theoretically *optimal*”. Yet, attempts to “prove” the “correctness” of backward induction through counterfactual conditionals continue to be made by Aumann (1998) and the LessWrong “rationalists”.
Please don’t spoil the fun for others by posting successful solutions. If you wish to solve the puzzle by yourself, don’t read the next article in this series until you have done so or given up.
In the next article, I will reveal and explain the solution that allows advancement to the next round of Money to Burn. Then, I will show how that leads to a rational argument that will cause even the most ardent “rational” players to abandon the currently touted “optimal” strategy in favor of a newly proposed one. I will be touching upon whether it is true that the Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent and Does Unfairness Triumph After All? And, as usual, I’ll tie this all back to the current “existential risk/AGI enslavement” movement and what we should be doing instead.
Thanks go to Julie Waser for helping me “Make it so that a real person can understand the puzzle.”
Aumann R (1992) Irrationality in Game Theory. In Dasgupta P Gale D Hart O Maskin E (Eds) Economic Analysis of Markets and Games: Essays in Honor of Frank Hahn, 214-27. MIT Press.
Aumann RJ (1995) Backward Induction & Common Knowledge of Rationality. Games and Economic Behavior, 8(1): 6-19.
Aumann RJ (1998) On the Centipede Game. Games and Economic Behavior, 23(1): 97-105. http://www.ma.huji.ac.il/raumann/pdf/Centipede%20Game.pdf
Palacios-Huerta I Volij O (2009) Field Centipedes. American Economic Review 99(4):1619-1635. http://www.palacios-huerta.com/docs/Field%20Centipedes.pdf
Rosenthal RW (1981) Games of Perfect Information, Predatory Pricing & Chain Store Paradox. Journal of Economic Theory, 25(1): 92-100.