Last year’s FIFA World Cup Panini sticker album had spaces for 682 stickers. Stickers were sold in packs of 5, at a cost of 80 pence per pack. How much was it likely to cost to fill the whole album? Maybe have a guess at this before moving on.















Well, to get 682 stickers you need 137 packs, so the obvious (but wrong) answer is 137 times 80 pence, which is £109.60. It’s wrong, of course, because it doesn’t take into account duplicate stickers: as the album fills up, when you buy a new pack, it’s likely that at least some of the new stickers will be stickers that you’ve already collected. And the more stickers you’ve already collected, the more likely it is that a new pack will contain stickers that you’ve already got. So, you’re likely to need many more than 137 packs and spend much more than £109.60. But how much more?
It turns out (see below) that on average the number of packs needed can be calculated as

$$\frac{1}{5}\left(\frac{682}{682} + \frac{682}{681} + \frac{682}{680} + \dots + \frac{682}{1}\right)$$

where the “…” means “plus all the terms in between”. So the next term in the sequence you have to add is 682/679 and then 682/678 and so on, all the way down to the final term in the sequence, which is 682/1.
The sum works out at about 969 packs, so the average cost of filling the album is around 969 × 80 = 77,520 pence, or £775. You can probably also guess how this calculation changes if the number of spaces in the album were different from 682 or if the number of stickers per pack were different from 5.
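If you'd like to check the arithmetic yourself, here is a minimal Python sketch of the sum described above. The function name `expected_packs` and its parameters are my own choices for illustration; the numbers (682 spaces, 5 stickers per pack, 80 pence per pack) come from the post.

```python
def expected_packs(n_stickers=682, pack_size=5):
    """Average number of packs under the simplified model:
    every sticker is equally likely and duplicates inside a pack are allowed."""
    # 682/682 + 682/681 + ... + 682/1 is the expected number of stickers.
    expected_stickers = sum(n_stickers / k for k in range(1, n_stickers + 1))
    # Each pack holds pack_size stickers.
    return expected_stickers / pack_size


packs = expected_packs()
cost_pounds = packs * 80 / 100  # 80 pence per pack, converted to pounds
print(round(packs), round(cost_pounds))  # prints: 969 775
```

Changing `n_stickers` or `pack_size` shows how the answer moves for a different album or pack size.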
Well, actually, there’s a small mistake in this calculation. Strictly speaking, when you buy packs of 5 stickers, none of the stickers in a pack will be duplicates among themselves. The above calculation ignores this fact, and assumes that duplicates could occur within packs. However, it turns out that doing the mathematics more carefully – which is quite a bit more complicated – leads to a not very different answer of £773. So, we might have simplified things in our calculation of £775, but we didn’t lose much in terms of accuracy.
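For the curious, one possible way to do the more careful calculation is sketched below. This is not necessarily the method behind the £773 figure. It assumes every pack is an equally likely set of 5 different stickers, and it works backwards from a full album, tracking the expected number of further packs for each possible number of stickers already held. The function name is hypothetical.

```python
from math import comb


def expected_packs_distinct(n_stickers=682, pack_size=5):
    """Expected packs when the stickers inside a pack are all different.
    Assumption: every set of pack_size distinct stickers is equally likely."""
    total = comb(n_stickers, pack_size)
    # expected[k] = expected further packs needed when k distinct stickers are held
    expected = [0.0] * (n_stickers + 1)
    for k in range(n_stickers - 1, -1, -1):
        missing = n_stickers - k
        # Probability that a pack contains no new sticker at all.
        p_none_new = comb(k, pack_size) / total
        acc = 1.0  # the pack we are about to buy
        for j in range(1, min(pack_size, missing) + 1):
            # Probability that exactly j of the stickers in the pack are new.
            p_j = comb(missing, j) * comb(k, pack_size - j) / total
            acc += p_j * expected[k + j]
        expected[k] = acc / (1.0 - p_none_new)
    return expected[0]


packs = expected_packs_distinct()
print(packs, packs * 80 / 100)  # expected packs and expected cost in pounds
```

Under these assumptions the result should land a few packs below the simplified figure of 969, in line with the small difference described above.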
Anyway, a question that’s just as interesting as the accuracy of the answer is what the value of £775 means in practice. Though it’s the average value that would be spent by many collectors in filling the album, the actual experience of any individual collector might be quite different from this. The mathematics is more complicated again in this case, but we can avoid the complexity by simulating the process. The figure below shows a histogram of the number of packs needed to fill the album in a simulation of 10,000 albums.
So, for example, I needed roughly 800 packs to complete the album in around 1500 of the simulated albums. Of course, the average number of packs needed turns out to be close to the theoretical average of 969. But although sometimes fewer than this number were needed, the asymmetry of the histogram means that on many occasions far more than the average number was needed. For example, on a significant number of occasions more than 1000 packs were needed; on several occasions more than 1500 packs were needed; and on a few occasions more than 2000 packs were needed (at a cost of over £1600!). By contrast, there were no occasions on which 500 packs were sufficient to complete the album. So, even though an average spend of £775 probably sounded like a lot of money to fill the album, any individual collector might need to spend as much as £1600 or more, while all collectors would have needed to spend at least £400.
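A simulation of this kind takes only a few lines. The sketch below is my own illustration rather than the code behind the histogram: it assumes every sticker is equally likely and, like the simplified calculation, allows duplicates within a pack. The function names and the seed are arbitrary choices.

```python
import random


def packs_to_fill(n_stickers=682, pack_size=5, rng=random):
    """Buy packs until every sticker has been seen; return the number of packs."""
    collected = set()
    packs = 0
    while len(collected) < n_stickers:
        packs += 1
        for _ in range(pack_size):
            # Assumption: each sticker is drawn independently and uniformly.
            collected.add(rng.randrange(n_stickers))
    return packs


def simulate(n_albums, n_stickers=682, pack_size=5, seed=0):
    """Simulate n_albums independent collectors with a fixed random seed."""
    rng = random.Random(seed)
    return [packs_to_fill(n_stickers, pack_size, rng) for _ in range(n_albums)]


# 200 albums keeps the run short; raise this to 10,000 to match the post.
results = simulate(200)
print("mean packs:", sum(results) / len(results))
print("min, max:", min(results), max(results))
print("share above 1000 packs:", sum(r > 1000 for r in results) / len(results))
```

Plotting `results` as a histogram should give the same lopsided shape: a sharp cut-off on the left and a long tail on the right.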
This illustrates an important point about Statistics in general – an average is exactly that: an average. And individual experiences might differ considerably from that average value. Moreover, asymmetry in the underlying probability distribution – as seen in the histogram above – will imply that variations from the average are likely to be bigger in one direction than the other. In the case of Panini sticker albums, you might end up paying a lot more than the average of £775, but are unlikely to spend very much less.
To be fair to Panini, it’s common for collectors to swap duplicate stickers with those of other collectors. This obviously has the effect of reducing the number of packs needed to complete the album. Furthermore, Panini now provide an option for collectors to order up to 50 specific stickers, enabling collectors who have nearly finished the album to do so without buying further packs when the chance of duplication is at its highest. So for both these reasons, the expected costs of completing the album as calculated above are overestimates. On the other hand, if certain stickers are made deliberately rarer than others, the expected number of packs will increase! Would Panini do that? We’ll discuss that in a future post.
Meantime, for maths enthusiasts, and just in case you’re interested, let’s see where the formula

$$\frac{1}{5}\left(\frac{682}{682} + \frac{682}{681} + \frac{682}{680} + \dots + \frac{682}{1}\right)$$

comes from. You might remember from an earlier post that if I repeat an experiment that has probability p of success until I get my first success, I will have to repeat the experiment an average of 1/p times. Well, buying new stickers until I get one that’s different from those I’ve already collected is an experiment of exactly this type, so I can use this result. But as the number of stickers I’ve already collected changes, so does the probability of obtaining a different sticker.
 At the start, I have 0 stickers, so the probability the next sticker will be a new sticker is 682/682, and the expected number of stickers I’ll need till the next new sticker is 682/682. (No surprises there.)
 I will then have 1 sticker, and the probability the next sticker will be a new sticker is 681/682. So the expected number of stickers I’ll need till the next new sticker is 682/681.
 I will then have 2 different stickers, and the probability the next sticker will be a new sticker is 680/682. So the expected number of stickers I’ll need till the next new sticker is 682/680.
 This goes on and on till I have 681 stickers and the probability the next sticker will be a new sticker is 1/682. So the expected number of stickers I’ll need till the next new sticker is 682/1.
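At that point I’ll have a complete collection. Adding together all these expected numbers of stickers gives

$$\frac{682}{682} + \frac{682}{681} + \frac{682}{680} + \dots + \frac{682}{1}$$

But each pack contains 5 stickers, so the expected number of packs I’ll need is

$$\frac{1}{5}\left(\frac{682}{682} + \frac{682}{681} + \frac{682}{680} + \dots + \frac{682}{1}\right)$$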
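Nothing in this argument depends on the particular numbers 682 and 5. As a sketch in my own notation (the symbols $n$ for the number of spaces, $m$ for the pack size and $\gamma \approx 0.5772$ for the Euler–Mascheroni constant are not used elsewhere in this post), the same steps give

$$\text{expected packs} = \frac{1}{m}\left(\frac{n}{n} + \frac{n}{n-1} + \dots + \frac{n}{1}\right) = \frac{n}{m}\sum_{k=1}^{n}\frac{1}{k} \approx \frac{n}{m}\left(\ln n + \gamma\right)$$

With $n = 682$ and $m = 5$, the approximation on the right gives about 969 packs, matching the sum above. It shares the same simplification of treating duplicates within a pack as possible.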