This is a true story that illustrates how interesting statistical questions can arise in simple everyday life. It’s a bit long though, so I’ll split it into two posts. In this one, I’ll give you the background information. In a subsequent post, I’ll discuss a possible solution to the problem that arises.
As many of you know, I live in Italy. Actually, in a small town called Belluno in the north-east of Italy, on the edge of the Dolomites. It’s great, but like most people’s life journeys, my route here hasn’t been straightforward.
I grew up on a dismal overflow council estate called Leigh Park, on the distant outskirts of Portsmouth. Leigh Park was once the largest council estate in Europe and, according to this article, “could well be the origin of all chavs upon this earth”. (Just in case you’re unfamiliar with the term chav, Wikipedia gives this definition: “A pejorative epithet used in the United Kingdom to describe a particular stereotype of anti-social youth dressed in sportswear”. Explains a lot, right?)
Anyway, the other day I had to take my son to the dentist in Belluno for a check-up. The waiting area in the dentist’s has recently been refurbished, and they’ve installed a large-screen TV on the main wall. But instead of showing anything interesting, the TV just seems to flip through random images: pictures of animals; of paintings; of architecture; of cities; of people; of pretty much anything. It’s probably meant to be soothing or distracting while you’re waiting for your teeth to be drilled.
So, I sat down and started looking at this TV. And the first image I saw was of a single-decker bus with destination Leigh Park (!), a little bit like this…
My first thought, obviously, was that this was a remarkably unlikely coincidence: a TV screen in Belluno, Italy, showing the image of a bus heading towards the completely unremarkable housing estate I grew up on in England. But we’ve discussed this kind of issue before: our lives are filled with many random events each day, and we notice only the ones that look like coincidences. So though it seems improbable that something like this could occur, it’s much less improbable when you weigh it against the many, many unremarkable things that also happen every day.
But the main theme of the story is something different…
I wanted to point out this coincidence – which connects to part of his own family history – to my son, but by the time I managed to drag his attention away from playing on his phone, the image had changed to something else. Having nothing better to do – my potential company for this visit was just playing on his phone, remember – I kept watching the TV screen. Now, although the images kept changing, I noticed after a while that some of the images had repeated. Not in any systematic order, but apparently at random. So, my best guess is that the screen was showing images from a fixed library of pictures in a random order. As such, the image of the Leigh Park bus would show up again at some point, but the time it would take to show up would depend on the size of the library of images. If there were just a few pictures, it would probably show up again very soon; if there were very many pictures, it would most likely take a long time.
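To put rough numbers on that last point, here is a small simulation sketch. It rests on an assumption I can't verify from the waiting room: that each image is drawn independently and uniformly at random from the library, with repeats allowed. The function names and the library sizes tried are made up for illustration. Under that assumption, the number of images you have to sit through before a particular one comes round again follows a geometric distribution, whose average equals the library size.

```python
import random


def wait_for_repeat(library_size, rng):
    """Count the draws needed until image 0 (standing in for the bus) appears."""
    draws = 1
    while rng.randrange(library_size) != 0:
        draws += 1
    return draws


def average_wait(library_size, trials=2000, seed=1):
    """Average the waiting time over many simulated visits to the dentist."""
    rng = random.Random(seed)
    total = sum(wait_for_repeat(library_size, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # Hypothetical library sizes; the real one is exactly what is unknown.
    for size in (10, 100, 1000):
        print(size, round(average_wait(size), 1))
```

With a library of 10 images the bus should be back after about 10 pictures on average; with 1,000 images you would expect to wait for about 1,000. If the TV actually shuffles the library and plays it through without repeats, the picture changes, so treat this as a sketch of one possible mechanism only.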
So, here’s the question:
How could I estimate the number of images in the library being shown randomly on the TV?
This seems like a reasonable sort of question to ask. I have some data – a list of the images I’ve observed, together with counts of how many times each one has repeated. And the more often I see repeats, the smaller I might expect the number of available images to be. But what strategy could I use, either based on the images I’d already observed, or by also observing extra images, to estimate the number of images in the entire library?
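The link between repeats and library size can be made concrete with a short sketch. Again it assumes images are drawn independently and uniformly at random with replacement, which is my guess rather than something I know about the TV. In that case, after watching n images from a library of N, the expected number of different images seen is N(1 − (1 − 1/N)^n). The code below compares that formula with a single simulated visit; the names and numbers are illustrative only, and this is not an estimation method, just the relationship that any method would have to exploit.

```python
import random


def expected_distinct(library_size, views):
    """Expected number of different images seen in `views` uniform random draws."""
    return library_size * (1 - (1 - 1 / library_size) ** views)


def simulated_distinct(library_size, views, seed=1):
    """Number of different images seen in one simulated sequence of draws."""
    rng = random.Random(seed)
    return len({rng.randrange(library_size) for _ in range(views)})


if __name__ == "__main__":
    views = 50  # hypothetical number of images watched
    for size in (20, 100, 1000):
        print(size, round(expected_distinct(size, views), 1),
              simulated_distinct(size, views))
```

Watching 50 images from a library of 20 would show almost everything, with plenty of repeats; from a library of 1,000, nearly all 50 would be new. The observed number of repeats therefore carries information about N, though how best to turn it into an estimate is the open question.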
I have an idea of how to tackle this problem, and I’ll discuss it in a future post. But I don’t think my idea is likely to be the best approach, and I’d be interested if anyone else has an alternative, which might well prove to be better. So, please think about this problem, and if you have suggestions of your own, please send them to me at firstname.lastname@example.org. I’ll include discussion of any ideas I receive in the subsequent post.