Monkey business

Stick a monkey on a typewriter, let him hit keys all day, and what will you get? Gibberish, probably. But what if you’re prepared to wait longer than a day? Much longer than a day. Infinitely long, say. In that case, the monkey will produce the complete works of Shakespeare. And indeed any and every other work of literature that’s ever been written.

This is from Wikipedia:

The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type any given text, such as the complete works of William Shakespeare.

Infinity is a tricky but important concept in mathematics generally. We saw the appearance of infinity in a recent post, where we looked at the infinite sequence of numbers

1, 1/2, 1/4, 1/8,….

and asked what their sum would be. And it turned out to be 2. In practice, you can never really add infinitely many numbers, but you can add more and more terms in the sequence, and the more you add the closer you will get to 2. Moreover, you can get as close to 2 as you like by adding sufficiently many terms in the sequence. It’s in this sense that the sum of the infinite sequence is 2.

In Statistics the concept of infinity and infinite sums is equally important, as we’ll discuss in a future post. But meantime… the infinite monkey theorem. What this basically says is that if something can happen in an experiment, and you repeat that experiment often enough, then eventually it will happen.

Sort of. There’s still a possibility that it won’t – the monkey could, by chance, just keep hitting the letter ‘a’ totally at random forever, for example – but that possibility has zero probability. That’s the ‘almost surely’ bit in the Wikipedia definition. On the other hand, with probability 1 – which is to say complete certainty – the monkey will eventually produce the complete works of Shakespeare.

Let’s look at the calculations, which are very similar to those in another recent post.

There are roughly 50 keys on a keyboard, so assuming the monkey is just hitting keys at random, the probability that the first key stroke matches the first letter of Shakespeare’s works is 1/50. Similarly, the probability the second letter matches is also 1/50. So to get the first two matching it’s

1/50 \times 1/50

Our monkey keeps hitting keys and at each new key stroke, the probability that the match-up continues is multiplied by 1/50. This probability gets small very, very quickly. But it never gets to zero.

Now, if the monkey has to hit N keys to have produced a text as long as the works of Shakespeare, by this argument he’ll get a perfect match with probability

p=(1/50)^N

This will be a phenomenally small number. Virtually zero. But, crucially, not zero. Because if our tireless monkey repeats that exercise a large number of times, let’s say M times, then the probability he’ll produces Shakespeare’s works at least once is

Q =   1-(1-p)^M

And since p is bigger than zero – albeit only slightly bigger than zero –  then Q gets bigger with N. And just as the sum of the numbers 1, 1/2, 1/4, … gets closer and closer to 2 as the number of terms increases, so Q can be made as close to 1 as we like by choosing M large enough.

Loosely speaking, when M is infinity, the probability is 1. And even more loosely: given an infinite amount of time our monkey is bound to produce the complete works of Shakespeare.


Obviously, both the monkey and the works of Shakespeare are just metaphors, and the idea has been expressed in many different forms in popular culture.  Here’s Eminem’s take on it, for example:

 

One-in-a-million

Suppose you can play on either of 2 slot machines:

  1. Slot machine A pays out with probability one in a million.
  2. Slot machine B pays out with probability one in 10.

Are you more likely to get a payout with one million attempts with slot machine A or with 10 attempts on slot machine B?

Have a think about this before scrolling down.

|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|

I was prompted to think about this question by the following tweet, which includes both the answer and the relevant calculations.

So, there’s a bigger probability (0.65) that you’ll get a payout from 10 spins of slot machine B than from a million spins of slot machine A (probability 0.63).

Hopefully, the calculations above are self-explanatory. But just in case, here’s the detail. Suppose you have N attempts to win with a slot machine that pays out with probability 1/N.

  1. First we’ll calculate the probability of zero payouts in the N spins.

2. This means we get a zero payout on every spin.

3. The probability of a zero payout on one spin is one minus the probability of a win: 1 – 1/N.

4. So the probability of no payout on all the spins is

  (1-1/N)^N

  5. And the probability of at least one payout is

1- (1-1/N)^N

As explained in the tweet, with N=10 this gives 0.65 and with N=1,000,000 it gives 0.63. The tweet’s author explains in a follow-up tweet that he was expecting the same answer both ways.

But as someone in the discussion pointed out, that logic can’t be right. Suppose you had one attempt with slot machine C which paid out with probability 1. In other words, N=1 in my example above. Then, of course, you’d be bound to get a payout, so the probability of at least one payout is 1. So, although it’s initially perhaps surprising that you’re more likely to get a payout with 10 shots at slot machine B than with a million shots at slot machine A, the dependence on N becomes obvious when you look at the extreme case of slot machine C.


Footnote: What does stay the same in each case however is the average number of times you will win. With N shots at a slot machine with win probability 1/N, you will win on average once for any choice of N. Sometimes you’ll win more often, and sometimes you may not win at all (except when N=1). But the average number of wins if you play many times will always be 1.

Sleight of hand

cards3

A while ago I sent a post with a card trick which I explained had a statistical element to it, and asked you to try to work out how it was done. Thanks to those of you who wrote to me with variants on the correct answer.

The rules of the game were that my assistant, Matteo, chose a card at random hidden from me. It happened to be a 5 in the video. I then turned the cards over one at a time and Matteo had to play a counting game. Once he reached the 5th card, he noted its value, which was a 10. So he then counted another 10 cards in the sequence, noted the value of that card, and so on until we ran out of cards. Matteo had to remember the final card in his sequence before the cards ran out, which turned out to be the eight of diamonds. My task as the magician was to predict what Matteo’s final card was, which I did successfully.

Now, there are 2 reasons why this is a statistical card trick.

  1. It doesn’t always work. It does so with a reasonably high probability, but depending on the  configuration of the cards once they are shuffled, won’t always. I’ll be honest: we had to remake the video several times, but that was always due to my incompetence in explaining the trick and not because it ever failed. Still, it won’t always work.
  2. The second reason it’s a statistical trick is in its execution. The way it works is that I also play the same counting game as Matteo, but starting with the value of the first card I turn over, which happened to be a 10. So, we’re both playing the same counting game but from different starting points. Matteo’s starting point is 5, mine is 10. Although we start from different places, it turns out to be quite likely – though not certain – that the counting sequences we follow will overlap at some point. And once they do overlap, we are then following exactly the same sequence and so will arrive at the same final card.

Technically, the sequences of cards Matteo and I are both following are called Markov chains. These are sequences of random numbers such that in order to understand what the next card might be I only need to know the value of the current card, without knowing the past sequence that took me to the current state. In other words, when Matteo has to start counting 10 cards, it doesn’t matter how he got to that position, just that that’s where he currently is. And I also generate my own Markov chain. With an unlimited number of cards in a pack, the mathematical properties of Markov chains would guarantee that our sequences  meet at some point, after which we would be following exactly the same sequence, leading me to have the same final card as Matteo. With just 52 cards in a pack, there’s no guarantee, which is why the trick won’t always work.

The fact that the trick might not work is a little undesirable, but you can increase the chances by counting picture cards as 1 rather than 10. This forces the sequences to change card more often, which increases the chances of our two sequences overlapping.


Markov chains are actually really important building blocks for modelling in many areas of Statistics, which is one reason why I used posted the card trick to the blog. I’ll use a future post to explain this though.

 

Magic

Here’s a statistical card trick. As I try to explain in the video, admittedly not very clearly, the rules of the trick are as follows:

  1. Matteo picks a card at random from the pack. This card is unknown to me.
  2. I shuffle the cards and turn them over one at a time.
  3. As I turn the cards over, Matteo counts them in his head until he reaches that number in the sequence. As you’ll see, his card was a 5, so he counts the cards until he reaches the 5th one.
  4. He then repeats that process, starting with the value of the 5th card, which happened to be a 10. So, he counts – again silently – a further 10 cards. He remembers the value of that card, and counts again that many cards.
  5. And so on until we run out of cards.
  6. (Picture cards count as 10.)
  7. Matteo has to remember the last card in his sequence before all of the cards ran out.
  8. And I – the magician – have to predict what that card was.

Now take a look at the video….

How did I do it? And what’s it got to do with Statistics? I’ll explain in a future post, but as usual if you’d like to write to me with your ideas I’ll be very happy to hear from you.

Massively increase your bonus

In one of the earliest posts to the blog last year I set a puzzle where I suggested Smartodds were offering employees the chance of increasing their bonus, and you had to decide whether it was in their interests to accept the offer or not.

<They weren’t, and they still aren’t, but let’s play along>.

Same thing this year, but the rules are different. Eligible employees are invited to gamble their bonus at odds of 10-1 based on the outcome of a game. It works like this…

For argument’s sake, let’s suppose there are 100 employees that are entitled to a bonus. They are told they each have the opportunity to increase their bonus by a factor of 10 by playing the following game:

  • Each of the employees is randomly assigned a number between 1 and 100.
  • Inside a room there are 100 boxes, also labelled 1 to 100.
  • 100 cards, numbered individually from 1 to 100, have been randomly placed inside the boxes, so each numbered box contains a card with a unique random number from 1 to 100. For example, box number 1 might contain the card with number 62; box number 2 might contain the card with number 25; and so on.
  • Each employee must enter the room, one a a time, and can choose any 50 of the boxes to open. If they find the card with their own number in one of those boxes, they win. Otherwise they lose.
  • Though the employees may discuss the game and decide how they will play before they enter the room, they must not convey any information to the other employees after taking their turn.
  • The employees cannot rearrange any of the boxes or the cards – so everyone finds the room in the same state when they enter.
  • The employees will have their bonus multiplied by 10 if all 100 of them are winners. If there is a single loser, they all end up with zero bonus.

Should the employees accept this game, or should they refuse it and keep their original bonuses? And if they accept to play, should they adopt any particular strategy for playing the game?

Give it some thought and then scroll down for some discussion.

|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|

A good place to start is to calculate the probability that any one employee is a winner. This happens if one of the 50 boxes they open, out of the 100 available, contains the card with their number. Each box is equally likely to contain their number, so you can easily write down the probability that they win. Scroll down again for the answer to this part:

|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|

There are 100 boxes, and the employee selects 50 of them. Each box is equally likely to contain their number, so the probability they find their number in one of the boxes is 50/100 or 1/2.

So that’s the probability that any one employee wins. We now need to calculate the probability that they all win – bearing in mind the rules of the game – and then decide whether the bet is worth taking.

In summary:

  • There are 100 employees;
  • The probability that any one employee wins their game is 1/2;
  • If they all win, their bonuses will all be multiplied by 10;
  • If any one of them loses, they all get zero bonus.

Should the employees choose to play or to keep their original bonus? And if they play, is there any particular strategy they should adopt?

If you’d like to send me your answers I’d be really happy to hear from you. If you prefer just to send me a yes/no answer, perhaps just based on your own intuition, I’d be equally happy to get your response, and you can use this form to send the answer in that case.


This is a variant on a puzzle pointed out to me by Fabian.Thut@smartodds.co.uk. I think it’s a little more tricky than previous puzzles I’ve posted, but it illustrates a specific important statistical issue that I’ll discuss when giving the solution.

Intuition

Intuition is a great asset, but:

  1. Can sometimes lead you astray;
  2. Always needs to be balanced by doubt, to avoid becoming complacency;
  3. Is never a substitute for genuine understanding and knowledge.

A while back I posed the question

You’re at a party and meet someone. After chatting for a bit, you work out that the girl you’re talking to has 2 kids and one of them is a boy. What are the chances she’s got 2 boys rather than a boy and a girl?

And I discussed the solution to this problem in a subsequent post. As I discussed, although it’s tempting to give an answer of 1/2 – arguing that the other child is equally likely to be a boy or a girl – this reasoning is wrong, and the correct answer is 1/3.

I then followed up this discussion by extending the original problem as follows. Everything is the same, except you are told that the woman has 2 children, one of whom is a boy that was born on a Tuesday. With this information, what is the probability that the other child is also a boy?

Again, it looks like there ought to be an easy answer to this problem. And again, it turns out that this easy answer is wrong. But this time the breakdown of the simple intuitive argument is often surprising even to people who work regularly with probabilities.

The intuitive – but wrong – answer is that the probability is still 1/3. The argument goes that there is no relationship between gender and day of birth. Technically,  gender and day of birth are statistically independent. So, the information provided by the day of birth is completely irrelevant and can be ignored, meaning the probability that the other child is a boy remains at 1/3.

Except it doesn’t. The true answer is 13/27, which is just slightly less than 1/2.

The calculation leading to this answer is not difficult, but slightly more complicated than is reasonable for me to include in this post. If you’re interested, you can find a full explanation here.

But, as explained in that article, it’s interesting to state the result in slightly more generality.  We added the condition that the boy was born on a Tuesday, an event which has probability 1/7 for both boys or girls. Suppose we replace this event with something else like ‘The boy is left-handed’; or ‘The boy has green eyes’; or ‘The boy was born on Christmas Day’. And suppose the probability of a child – boy or girl – fulfilling this extra condition is p.

For the original Tuesday’s child problem, p=1/7. For the Christmas birthday p=1/365. For green eyes it’s whatever the proportion of people in the population with green eyes is. And so on.

Anyway, if we replace the condition of being born on a Tuesday with some other condition whose probability is p for either boys or girls, it turns out that the probability that both the children are boys, given the information provided, is

Q = \frac{2-p}{4-p}

If we substitute p=1/7 in this expression we get Q = 13/27, which explains the answer given above. But it’s not the value itself that’s important; it’s the fact that it’s different from 1/3. So, although gender and day of birth are independent, giving the extra information about day of birth shifts the original probability from 1/3 to 13/27.

And there’s something even more interesting. If the day of birth extra condition is replaced with a condition that’s less likely, so that p is smaller, then the value of Q gets closer and closer to 1/2. For example,  with the Christmas birthday condition, p=1/365 leading to Q = 0.4996573. In other words, if we include a very unusual condition for the child known to be a boy, the probability that the other child is also a boy gets very close to  1/2, which is the answer you get to the original problem by using the wrong intuition.

Not very intuitive, but it’s the truth.


Finally, here’s Dilbert’s take on intuition:

Family problems

In an earlier post, I set the following problem:

You’re at a party and meet someone. After chatting for a bit, you work out that the girl you’re talking to has 2 kids and one of them is a boy. What are the chances she’s got 2 boys rather than a boy and a girl?

Following some feedback, I later updated that post to clarify that the question isn’t intended to be about possible slight differences in the birth rates of boys and girls. That’s an interesting biological and demographic issue, but wasn’t intended as the point of the question. For the purposes of the question I simply meant to assume that all children, regardless of a mother’s previous history of births, is equally likely to be a boy or a girl.

In that case, it’s very tempting to answer the question above as 1/2. Indeed, this was the point of the post. One child’s a boy. The other is just as likely to be a boy or a girl, so the answer must be 1/2.

Except it’s not.

The answer is 1/3, and here’s the reasoning…

Without any additional information, if a woman has a 2-child family the possibilities (with the oldest child listed first) are:

Boy- Boy, Boy-Girl, Girl-Boy, Girl-Girl

and because all children are equally likely to be male or female, these combinations are all equally likely. But we can rule out the Girl-Girl combination from the information in the problem, so the remaining possibilities are

Boy-Boy, Boy-Girl, Girl-Boy

with each being equally likely. But if you consider a boy in one of these pairs, it’s only in the first case that the other child is a boy. So, in just 1 of the 3 equally likely outcomes is the other child also a boy, so the probability is 1/3.

This simple problem illustrates the difficulty in calculating what is called a conditional probability – the probability of something conditional on some given information, in this case that one of the children is a boy. Whereas in general the chance of a child being a boy is 1/2, once you include the extra information that we’re looking at a 2-child family in which at least one of the children is a boy, the probability changes. At first it seems counter-intuitive, but once you’ve seen a few problems of this type, your intuition becomes sharper.

With that in mind, let me pose the same problem as above, but suppose you find out that one of the woman’s kids is a boy that was born on a Tuesday. Now what’s the probability that the other child is also a boy?

As usual, I’ll write a future post with the solution and discussion. But if you’d like to send me your own ideas, either by mail or via this survey form, I’d be really happy to hear from you.


Thanks to those of you who replied to the original question. Apart from some initial confusion about whether I was suggesting boys might be more or less common than girls in general, there was roughly a 50-50 split in answers between 1/2 and 1/3. As explained above, it’s easy to be misled into thinking the answer is 1/2, so there is no embarrassment in arriving at that answer. And like I say, that was the whole point of the post. Nonetheless, well done to those of you who got the correct answer of 1/3.

It’ll be interesting to see what replies I get to the revised problem above, so please do send me your answer or thoughts on the problem if you have time.

About a boy

You’re at a party and meet someone. After chatting for a bit, you work out that the girl you’re talking to has 2 kids and one of them is a boy. What are the chances she’s got 2 boys rather than a boy and a girl?

Actually, I really want to ask a slightly more complicated question than this. But let’s take things slowly. Please think about this problem and, if you have time, mail me or send me your answer via this form. Subsequently, I’ll discuss the answer to this problem and ask you the slightly more complicated question that I’m really interested in.


Quick update: just to be clear, assume that all children are equally likely to be born male or female, and that this doesn’t change even if a mother has already given birth to previous children of known gender.

66,666 Random Numbers, Volume 1

A while ago I posted about gambling at roulette, and explained that whatever strategy you adopt – excluding the possibility of using electronic equipment to monitor the wheel and ball speeds, and improve prediction of where the ball will land – no strategy can overcome the edge that casinos offer by giving unfavourable odds on winning outcomes. Now, believe it or not, I do a fair bit of research to keep this blog ticking over. And in the course of doing the research for a potential casino/roulette post, I came across this book:

That’s right: 66,6666 random numbers. But not just any numbers: the numbers on a roulette wheel, 0-36. The numbers are also colour coded as per a standard roulette wheel. Here’s a typical page:

 

But there’s more:

  1. The book includes a bonus set of an extra 10,000 random numbers. (Question: why not just call the book 76,666 random numbers?)
  2. There’s also an American Edition, which is almost identical, but accounts for the fact that in American casinos, the wheel also includes a 00.
  3. This is just Volume 1. Further volumes don’t seem to have gone into production yet, but the title suggests it’s just a matter of time.

Now, tables of random numbers have their place in history. As explained in an earlier post, simulation is a widely-used technique in statistical analysis, when exact mathematical calculations for statistical problems are too complicated. And before computers were widely available, it was commonplace to use tables of random digits as the basic ingredient for simulation routines.

But, hello! This is 2019. Chances are there’s a reasonable random number generator in the calculator on your phone. Or you can go here and fiddle around with the settings in the dialog box. Or you can fire-off 66,666 random numbers with a one-line code in R or any other statistical language. You can even do it here:

# simulate the results numbers <- sample(0:36, 66666, replace=T) # tabulate the results table(numbers) # show results as a barplot library(ggplot2) df<-data.frame(table(numbers)) colnames(df)<-c('number','frequency') ggplot(data=df, aes(x=number, y=frequency)) + geom_bar(stat="identity", width=0.5, fill='lightblue') +ggtitle('Frequencies of Results in 66,666 Roulette Spins')

Just hit the ‘run’ button. This may not work with all browsers, but seems to work ok with Chrome.

The simulation is all done in the first non-comment line. The rest is just some baggage to tabulate the frequencies and show them graphically.

This approach has the advantages that:

  1. You get different numbers every time you repeat the exercise, just like in real life;
  2. The numbers are stored electronically, so you can analyse them easily using any statistical functions. If you ran the R session above, you’ll have seen the results in tabulated summary form, as well as in a barplot, for example. But since the data are stored in the object ‘numbers’, you can do anything you like with them. For example, typing ‘mean(numbers)’ give you the mean of the complete set of simulated spins.

So, given that there are many easy ways you can generate random numbers, why would anybody possibly want to buy a book with 66,666 random numbers? Well, here’s the author to explain:

After gaining a moderate amount of experience playing roulette, I discovered how easy it was to master the rules of the game – and still lose!

He goes on…

Having lost my bankroll and now distrusting my knowledge of statistics as they pertained to roulette, I scoured the Internet for more information on the game. My findings only confirmed what I already knew: that statistics can only define the shape and character of data and events that have already taken place and have no real bearing over the outcome of future spins.

And finally…

I chose to compile a book of 66,666 random numbers for two reasons: One, I’ve paid my dues – literally, I’ve lost thousands of dollars playing this game, and I don’t want you to suffer the same consequence; two, as roulette is a game played against the house and not against my fellow gamblers, I knew I wanted to provide you with the same opportunity to study these numbers and learn something that might just make a difference in the way you play the game.

In summary, despite having lost a fortune believing there is some system to win at roulette, and despite sincerely wishing that you avoid the same fate, having learned through experience that no roulette system can possibly work, the author has provided you with 66,666 spins (plus a bonus 10,000 extra spins) of a roulette wheel so that you can study the numbers and devise your own system.(Which is bound to fail and almost certainly cost you a fortune if you try to implement it).

Now, just to emphasise:

  1. The random properties of a roulette wheel are very simply understood from basic probability;
  2. A study of the outcome of randomly generated spins of a roulette wheel is a poor substitute for these mathematical properties;
  3. Biases in the manufacture or installation of a roulette wheel, which could make some numbers, or sequences of numbers, more frequent than others, are likely to be vanishingly small. But if there were such biases, you’d need to study a very long series of the outcomes of that particular wheel to be able to exploit them;
  4. You might choose to play roulette for fun. And you might even get lucky and win. But it is a game with negative expected winnings for the gambler, and if you play long enough you will lose with 100% certainty.

However, we’ve seen a similar mis-use of simulation before. In this post a newspaper did 100 random simulations of the NBA lottery draft in order to predict the lottery outcome. The only difference with the roulette simulation is that 66,666 is a rather bigger number – and therefore greater waste of time – than 100.

Moral: simulation can often be avoided through a proper understanding of the randomness in whatever process you are studying. But if you really have to simulate, learn the basics of a language like R; don’t waste time and money on books of random tables.

By coincidence

dna2

In an earlier post I suggested we play a game. You’d pick a sequence of three outcomes of a coin toss, like THT. Then I’d pick a different triplet, say TTT. I’d then toss a coin repeatedly and whoever’s chosen triplet showed up first in the sequence would be the winner.

In the post I gave the following example…

H T H H H T T H T H …..

… and with that outcome and the choices above you’d have won since your selection of THT shows up starting on the 7th coin toss, without my selection of TTT showing up before.

The question I asked was who this game favoured. Assuming we both play as well as possible, does it favour

  1. neither of us, because we both get to choose and the game is symmetric? Or;
  2. you, because you get to choose first and have the opportunity to do so optimally? Or;
  3. me, because I get to see your choice before I have to make mine?

The answer turns out to be 3. I have a big advantage over you, if I play smart. We’ll discuss what that means in a moment.

But in terms of these possible answers, it couldn’t have been 2. Whatever you choose I could have chosen the exact opposite and by symmetry, since H and T are equally likely, our two triplets would have been equally likely to occur first in the sequence. So, if you choose TTT, I choose HHH. If you choose HHT, I choose TTH and so on. In this way I don’t have an advantage over you, but neither do you over me. So we can rule out 2 as the possible answer.

But I can actually play better than that and have an advantage over you, whatever choice you make. I play as follows:

  1. My first choice in the sequence is the opposite of your second choice.
  2. My second and third choices are equal to your first and second choices.

So, if you chose TTT, I would choose HTT. If you chose THT, I would choose TTH. And so on. It’s not immediately obvious why this should give me an advantage, but it does. And it does so for every choice you can make.

The complete set of selections you can make, the selections I will make in response, and the corresponding odds in my favour are given in the following table.

Your Choice My Choice My win odds
HHH THH 7:1
HHT THH 3:1
HTH HHT 2:1
HTT HHT 2:1
THH TTH 2:1
THT TTH 2:1
TTH HTT 3:1
TTT HTT 7:1

As you can see, your best choice is to go for any of HTH, HTT, THH, THT, but even then the odds are 2:1 in my favour. That’s to say, I’m twice as likely to win as you in those circumstances. My odds increase to 3:1 – I’ll win three times as often as you – if you choose HHT or TTH; and my odds are a massive 7:1 – I’ll win seven times as often as you – if you choose HHH or TTT.

So, even if you play optimally, I’ll win twice as often as you. But why should this be so? The probabilities aren’t difficult to calculate, but most are a little more complicated than I can reasonably include here. Let’s take the easiest example though. Suppose you choose HHH, in which I case I choose THH. I then start tossing the coins. It’s possible that HHH will be the first 3 coins in the sequence. That will happen with probability 1/2 x 1/2 x 1/2 =1/8. But if that doesn’t happen, then there’s no way you can win. Because the first time HHH appears in the sequence it will have had to have been preceded by a T (otherwise HHH has occurred earlier). In which case my THH occurs before your HHH. So, you would have won with probability 1/8, and therefore I win with probability 7/8, and my odds of winning are 7:1.

Like I say, the other cases – except when you choose TTT, which is identical to HHH, modulo switching H’s and T’s – are a little more complicated, but the principle is essentially the same every time.

By coincidence, this game was invented by Walter Penney (Penney -> penny, geddit?), who published it in the Journal of Recreational Mathematics in 1969. It’s interesting from a mathematical/statistical/game-theory point of view because it’s an example of a non-transitive game. For example, looking at the table above, HHT is inferior to THH; THH is inferior to TTH; TTH is inferior to HTT; and HTT is inferior to HHT. Which brings us full circle. So, there’s no overall optimal selection. Each can be beaten by another, which in turn can be beaten by a different choice again. This is why the second player has an advantage: they can always find a selection that will beat the first player’s. It doesn’t matter that their choice can also be beaten, because the first player has already made their selection and it wasn’t that one.

The best known example of a non-transitive game is Rock-Paper-Scissors. Rock beats scissors; scissors beats paper; paper beats rock. But in that case it’s completely deterministic – rock will always beat paper, for example. In the Penney coin tossing game, HTT will usually beat TTT, but occasionally it won’t. So, it’s perhaps better defined as a random non-transitive game.

The game also has connections with genetics. Strands of DNA are long chains composed of sequences of individual molecules known as nucleotides. Only four different types of nucleotide are found in DNA strands, and these are usually labelled A, T, G and C. The precise ordering of these nucleotides in the DNA strand effectively define a code that will determine the characteristics of the individual having that DNA.

It’s not too much of a stretch of the imagination to see that a long sequence of nucleotides A, T, G and C is not so very different – albeit with 4 variants, rather than 2 – from a sequence of outcomes just like those from the tosses of a coin. Knowing which combinations of the nucleotides are more likely to occur than others, and other combinatorial aspects of their arrangements, proves to be an important contribution of Statistics to the understanding of genetics and the development of genetic intervention therapies. Admittedly, it’s not a direct application of Penney’s game, but the statistical objectives and techniques required are not so very different.


Thanks to those of you who wrote to me with solutions to this problem, all of which were at least partially correct.