Between 23 and 7.8 billion

Some of you might remember that a while back I gave a talk at an offsite where I used the classic birthday problem as a motivating example when discussing the potential pitfalls of looking back at data and identifying coincidences that seem too unlikely to have occurred by chance.

The problem is this: what’s the least number of people you need in a room for there to be a 50% chance or more that at least 2 people have the same birthday? And the answer, which seems surprisingly low to most people, is 23.
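For anyone who wants to check the 23 for themselves, here is a minimal sketch of the standard calculation in Python. It assumes 365 equally likely birthdays, no leap years, and people whose birthdays are independent of each other; the function names are made up for illustration.

```python
def prob_shared_birthday(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday.

    Assumes every day is equally likely and birthdays are independent.
    """
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def smallest_group(target: float = 0.5, days: int = 365) -> int:
    """Smallest n for which the shared-birthday probability reaches target."""
    n = 1
    while prob_shared_birthday(n, days) < target:
        n += 1
    return n


if __name__ == "__main__":
    print(smallest_group())                      # 23
    print(round(prob_shared_birthday(22), 4))    # 0.4757
    print(round(prob_shared_birthday(23), 4))    # 0.5073
```

With 22 people the probability is still a little under 50%, and the 23rd person tips it over.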

Anyway, Sunday is November 11th, so happy birthday to Harry.Hill@smartodds.co.uk, Rickie.Reynolds@smartodds.co.uk and willfletcher1111@gmail.com. And me.

In honour of this occasion I’d like to give you a variant of the classic birthday problem:

What’s the least number of people you need in a room for there to be a 50% chance or more that everyone in the room has the same birthday as someone else in the room?

They don’t all have to have the same single birthday, but there must be no one in the room who has a unique birthday.

So, just to be clear: with 23 people we know there’s roughly a 50% chance that at least 2 people will share a birthday. But it’s very unlikely in that case that everyone in the room shares a birthday with someone else in the room. On the other hand, if we squeezed the whole population of the world – 7.8 billion people – into a single room, it’s pretty much guaranteed that everyone will share a birthday with someone else in the room. So to get a 50% chance that everyone in the room shares a birthday with someone else, we’ll have to fill it with somewhere between 23 and 7.8 billion people. But how many?

The exact calculation isn’t very easy, so I’m not expecting you to actually do it. But I am interested in what you might guess this number to be. So, can I ask you please to have a guess and send me your answer via this survey? The answers will be anonymous, but I think it might be interesting to see how accurate, collectively, we are at guessing the answer to a problem of this type. So, please don’t be shy, just click through to the survey, enter your best guess, and hit return. In a future post I’ll give the actual answer and summarise the answers that you send me.

8 thoughts on “Between 23 and 7.8 billion”

    1. hi paul…

      close, but i think the answer is 3064. see here

      chances are you just need to do a longer monte-carlo run to reduce sampling error. thanks for replying though.

  1. Thanks for the correction. I verified the correct value of 3064 using the C++11 uniform distribution RNG. Regarding the speed of the simulation, the problem seems like a great candidate for multithreading although I haven’t tried that yet. For example, we could use five threads with one thread responsible for checking there are no unique birthdays for people born between day 1 and day 73, another thread for checking no unique birthdays for people born between day 74 and day 146 etc.
    And, more significantly, different simulations could be run concurrently on different threads or different cores.
    Great problem which was new to me. It’s known as the “strong birthday problem” apparently.
    Thanks for sharing.

    Paul

    1. Hi Paul, thanks for the reply and the extra details. It’s a really nice idea using this exercise to illustrate multi-threading techniques, and if I was teaching that sort of stuff, it would be fun to do it that way. As it is, I just wanted to get people to use their intuition to make a best guess at what the true answer might be. As I wrote here, the true answer is given in an academic paper, but to be honest I wasn’t able to prove that formula myself, so I wrote a quick program in R to verify the solution (3064) for a year of 365 days. (The formula is more general, obviously.) So, since it was just a one-off calculation, I was happy with an inefficient simulation routine that took just a few minutes to write, even if it then took a couple of minutes to run to get a decent precision on the answer. (Not that I’d be competent to write an efficient version, mind :-)).
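A minimal sketch of that kind of quick-and-dirty simulation is below. It is written in Python rather than R, is not the program mentioned above, and assumes 365 equally likely, independent birthdays; the function names, trial count and seed are arbitrary choices for illustration.

```python
import random
from collections import Counter


def everyone_shares(n: int, days: int, rng: random.Random) -> bool:
    """Simulate one room of n people; True if nobody has a unique birthday."""
    counts = Counter(rng.randrange(days) for _ in range(n))
    # Days nobody was born on never appear in counts, so we only need to
    # check that every birthday that does occur is held by at least 2 people.
    return all(c >= 2 for c in counts.values())


def estimate(n: int, trials: int, days: int = 365, seed: int = 0) -> float:
    """Monte Carlo estimate of P(no unique birthdays among n people)."""
    rng = random.Random(seed)
    hits = sum(everyone_shares(n, days, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # If 3064 is right, this should come out close to 0.5, give or take
    # the sampling error of the run.
    print(estimate(3064, trials=2000))
```

With a true probability near 0.5, the standard error of the estimate is about sqrt(0.25 / trials), so roughly 0.011 for 2000 trials. Adding one person changes the probability by far less than that, which is why a short run can land on a neighbouring value and a longer run is needed to pin down the exact crossover.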

      Yes, I saw that it was called the strong birthday problem, though I think there might be more than one problem with a claim to that name anyway. Will have to dig out the other versions for the post on my birthday this year 🙂

      Thanks again for your interest. Not sure if you’re following the blog more generally, but if you are I hope some of the other posts are of interest to you too.

      Best wishes…

      Stuart
