Britain’s toughest quiz

A year ago I wrote a post explaining that one of the traditions of the Royal Statistical Society is that every year, around Christmas, it publishes a quiz that is widely recognised to be one of the toughest out there. The questions are never strictly statistical or mathematical, but they do often require an ability to think laterally and logically, as well as a good general knowledge.

So, in case you’ve nothing better to do over Christmas, this year’s version of the quiz has just been published. Feel free to have a go and submit your answers; otherwise send me your answers and we can submit a team effort. (Teams of up to 5 people are allowed). Don’t worry if you struggle though: my net score prior to last year’s quiz was zero, a value that didn’t change following last year’s quiz.

As a guide to what type of thinking goes into the questions and solutions, here are links to last year’s quiz and solutions.

In any case, happy Christmas, and I hope you have a great holiday.


Santa Claus is coming to town

The substance of this post, including the terrible joke in the finale, is all stolen from here.

Look at this graph. The Santas represent points on the graph, and broadly show that the closer you get to Christmas, the more numerous the sightings of Santa. (Presumably in supermarkets and stores, rather than in grottos and sleighs, but you get the idea).

As discussed in previous posts – here, for example – we can measure the extent to which these two variables are related using the correlation coefficient. If the data lined up perfectly on an increasing straight line, the correlation would be 1. If the variables were completely unrelated, the correlation would be close to zero. (Unlikely to be exactly zero, due to random variation).

For the Santa data, the correlation is probably around 0.95. It’s not quite 1 for two reasons: first there’s a bit of noise around the general trend between the variables; second, the relationship itself looks slightly curved. But anyway, there’s a clear pattern to be observed: as Christmas approaches, the sightings of Santa increase. And this would manifest itself with a correlation coefficient close to 1.
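If you want to see the same calculation numerically, here’s a minimal sketch in Python. The numbers are invented to mimic the graph – an increasing, slightly curved trend with a little noise – since I don’t have the data behind the picture:

```python
import numpy as np

rng = np.random.default_rng(25)

# Invented Santa data: day of December (1-24) against sightings.
# Sightings grow as Christmas approaches, with a slight curve and
# some noise, mimicking the pattern in the graph.
day = np.arange(1, 25)
sightings = 5 + 2 * day + 0.1 * day**2 + rng.normal(0, 4, size=day.size)

# Pearson correlation coefficient between the two variables
r = np.corrcoef(day, sightings)[0, 1]
print(round(r, 3))  # close to, but not exactly, 1
```

The noise and the slight curvature are exactly why the value lands near, but not at, 1.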

What’s the effect of this relationship? Well, changing the time period before Christmas – say moving from a month before Christmas to a week before Christmas – will change the number of Santas you’re likely to see. But does it work the other way round? If we dressed a few extra people up as Santa, would it change the number of days left till Christmas? Clearly not. There’s cause and effect between the two variables in the graph, but it works in only one direction. The number of days left till Christmas affects the number of Santas you see on the street, but it simply doesn’t work the other way around.

Conclusion:

Correlation doesn’t imply Clausality!

Hohoho.


Footnote: the correct version of this phrase, ‘Correlation doesn’t imply Causality’, was the subject of an earlier post.

Just lucky

According to world chess champion Magnus Carlsen, the secret to his success is…

I’ve just been lucky.

Lucky? At chess?

Well, no, actually. This is Carlsen talking about his success at Fantasy Football. At the time of writing, Carlsen’s Premier League Fantasy Football team, Kjell Ankedal, is top of the League:

Top of the league sounds great, but this picture, which shows just the top 10 teams, is a little misleading. The Premier League Fantasy Football League actually has more than 6 million teams, and Kjell Ankedal is currently top of all of them. Moreover, Kjell Ankedal has finished in the top 5% of the league for the past 4 seasons, and in 2017-18 finished 2397th. Again, with 6 million teams the 2017-18 result would place Carlsen in the top 0.04%.

Obviously, football – and by extension fantasy football – is a game with many more sources of random intervention than chess, including the referee, the weather, VAR, the managers and just the inevitable chaos that can ensue from the physics of 22 people chasing, kicking and punching a ball. Compare that with the deterministic simplicity of a chess move such as e4.

And yet…

Can it be that Carlsen is ‘just lucky’ at Fantasy Football? Lucky to be top of the league after finishing in the top 5% or so, year after year? Well, we could make some assumptions about Carlsen actually being just an average player, and then work out the probability that he got the set of results he actually got, over this and recent seasons, if he was really just lucky rather than a very good player…

And it would be vanishingly small.
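To give a feel for just how small, here’s a back-of-envelope version in Python. The assumptions are mine and deliberately crude: an ‘average’ player’s final rank each season is uniformly distributed over all entrants, and seasons are independent:

```python
# Probability that a purely average player finishes in the top 5%
# in each of four independent seasons, and then tops a league of
# 6 million teams in a fifth season. All assumptions illustrative.
p_top_5_percent = 0.05
n_seasons = 4
n_teams = 6_000_000

p_just_lucky = p_top_5_percent**n_seasons * (1 / n_teams)
print(f"{p_just_lucky:.1e}")  # about 1e-12: vanishingly small
```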

In his Ted Talk, Rasmus Ankersen says that the famous quote ‘The league table never lies’ should be replaced with ‘The league table always lies’. There’s simply too much randomness in football matches for a league table based on 38 matches or so per team to end up with a ranking of teams that reflects their exact ability. And yet, if you look at the top and bottom of most league tables there are very few surprises. League tables are noisy arrangements of teams ranked by their ability, but they are not just total chaos. Better teams generally do better than poorer teams, and teams are never champions or relegated just due to good or bad luck. So, to be in the top few percent of players, consistently over several seasons, with so many people playing is just implausible unless Carlsen is a much-better-than-average player.

So, while it’s true that Carlsen’s precise Fantasy Football ranking is affected to a greater extent by luck than is his world chess ranking, it’s probably a little disingenuous for him to say he’s just been lucky.

And maybe it’s no coincidence that someone who’s eminently brilliant at chess turns out also to be eminently brilliant at fantasy football. Maybe one of the keys to Carlsen’s success at chess is an ability to optimise his strategy over the uncertainty in the moves his opponent will make.

Or maybe he’s just brilliant at everything he does.


Obviously, what applies to Carlsen with respect to Fantasy Football applies equally well to betting syndicates trading on football markets. Luck will play a large part in determining short term wins and losses, but in the very long term luck is ironed out, and what determines the success of the syndicate is their skill, judgement and strategy.

Friday the 13th


Friday 13th. What could possibly go wrong today?

Well, according to people who suffer from Friggatriskaidekaphobia – the fear of Friday 13th – rather a lot. But is there any rationale for a fear of Friday 13th?

The scientific evidence is patchy. One study published in the British Medical Journal – ‘Is Friday the 13th bad for your health?’ – apparently found a 52% increase in hospital admissions from road accidents on Fridays that fell on the 13th of the month, compared with other Fridays. However, one of the authors, Robert Luben, was subsequently quoted as saying:

It’s quite amusing and written with tongue firmly in cheek. It was written for the Christmas edition of the British Medical Journal, which usually carries fun or spoof articles.

I guess the authors looked at several possible statistics and reported the one that, by chance, fitted the hypothesis of Friday the 13th being unlucky. We’ve discussed this issue before: if you look at enough different phenomena where there is nothing of interest, some of them will look like there is something interesting happening just by chance. Statistics as a subject can be – and often is – badly misused this way.
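A quick simulation makes the point. This sketch is my own illustration, nothing to do with the BMJ authors’ analysis: it tests 20 ‘phenomena’ in which nothing at all is going on, at the usual 5% significance level:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)

# 20 phenomena with no real effect: both samples are drawn from the
# same distribution, so any 'significant' difference is pure chance.
false_positives = 0
for _ in range(20):
    a = rng.normal(0, 1, size=50)
    b = rng.normal(0, 1, size=50)
    _, p_value = stats.ttest_ind(a, b)
    false_positives += p_value < 0.05

# On average about 1 in 20 null phenomena will 'look interesting'
print(false_positives)
```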

Not everyone seemed to see it as a joke though. A follow-up study in the American Journal of Psychiatry titled ‘Traffic Deaths and Superstition on Friday the 13th’ found a higher accident rate for women, but not men, on Fridays falling on the 13th of the month. This was subsequently contested by another group of researchers, who published an article in the journal BMC Public Health titled ‘Females do not have more injury road accidents on Friday the 13th’. Who to believe?

So, it’s a mixed bag. Moreover, as reported in Wikipedia – which gives an interesting history of the origins of the superstitions associated with Friday 13th – road accidents, in the Netherlands at least, are less frequent on Friday 13th, arguably because people take more care than usual. But even there I’d be cautious about the results without having a detailed look at the way the statistical analysis was carried out.

And anyway, Tuesday 8th is the new Friday 13th. You’ve been warned.


Footnote: I’m writing this on Thursday 12th, blissfully unaware of whatever horrors this particular Friday 13th will bring.

Cube-shaped poo

Do you like pizza? If so, I’ve got good and bad news for you.

The good news is that the 2019 Ig Nobel prize winner in the category of medicine is Silvano Gallus, who received the award for…

… collecting evidence that pizza might protect against illness and death…

The bad news, for most of you, is that this applies…

…if the pizza is made and eaten in Italy.

Obviously, it’s a bit surprising that pizza can be considered a health food. But if you accept that, it’s also a bit surprising that it has to be Italian pizza. So, what’s going on?

The Ig Nobel prizes are a satirical version of the Nobel prizes. Here’s the Wikipedia description:

The Ig Nobel Prize (/ˌɪɡnoʊˈbɛl/ IG-noh-BEL) is a satiric prize awarded annually since 1991 to celebrate ten unusual or trivial achievements in scientific research, its stated aim being to “honor achievements that first make people laugh, and then make them think.” The name of the award is a pun on the Nobel Prize, which it parodies, and the word ignoble.

As such, the prize is awarded for genuine scientific research, but for areas of research that are largely incidental to human progress and understanding of the universe. For example, this year’s prize in the field of physics went to a group of scientists for…

…studying how, and why, wombats make cube-shaped poo.

It’s in this context that Silvano Gallus won his award. But although the Ig Nobel award says something about the irrelevance of the subject matter, it’s not intended as a criticism of the quality of the underlying research. Gallus’s work with various co-authors (all Italian) was published as an academic paper, ‘Does Pizza Protect Against Cancer?’, in the International Journal of Cancer. This wouldn’t happen if the work didn’t have scientific merit.

Despite this, there are reasons to be cautious about the conclusions of the study. The research is based on a type of statistical experimental design known as a case-control study. This works as follows. Suppose, for argument’s sake, you’re interested in testing the effect of pizzas on the prevention of certain types of disease. You first identify a group of patients having the disease and ask them about their pizza-eating habits. You then also find a group of people who don’t have the disease and ask them about their pizza-eating habits. You then check whether the pizza habits are different in the two groups.

Actually, it’s a little more complicated than that. It might be that age or gender or something else is also different in the two groups, so you need to correct for these effects as well. But the principle is essentially just to see whether the tendency to eat pizza is greater in the control group – if so, you conclude that pizza is beneficial for the prevention of the specified disease. And on this basis, for a number of different cancer types, Silvano Gallus and his co-authors found the proportion of people eating pizzas occasionally or regularly to be higher in the control group than in the case group.
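For what it’s worth, the arithmetic behind such a comparison is simple. Here’s a minimal sketch with invented counts – these are not the numbers from the Gallus paper:

```python
# Invented 2x2 table from a hypothetical case-control study:
#                   pizza eaters   non-pizza-eaters
# cases (disease)       120              180
# controls              200              100
a, b = 120, 180   # cases
c, d = 200, 100   # controls

# Proportion of pizza eaters in each group
print(a / (a + b))   # cases:    0.40
print(c / (c + d))   # controls: ~0.67

# The usual summary is the odds ratio: a value below 1 means pizza
# eating is less common among cases, i.e. apparently 'protective'.
odds_ratio = (a / b) / (c / d)
print(round(odds_ratio, 2))  # 0.33 with these invented counts
```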

Case-control studies are widely used in medical and epidemiological studies because they are quick and easy to implement. The more rigorous ‘randomised controlled study’ would work as follows:

  1. You recruit a number of people for the study, none of whom have the disease of interest;
  2. You randomise them into two groups. One of the groups will be required to eat pizza on a regular basis; the other will not be allowed to eat pizza;
  3. You follow the 2 groups over a number of years and identify whether the rate of disease turns out to be lower in the pizza-eating group than in the non-pizza-eating group;
  4. Again, you may want to correct for other differences in the 2 groups (though the need for this is largely eliminated by the randomisation process).
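In code, the randomise-then-compare logic of steps 1-3 might look something like this. It’s a sketch only: the group sizes and the ‘true’ disease rates (3% with pizza, 5% without) are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2019)

# Step 1: recruit 2000 healthy people.
# Step 2: randomise half of them into the pizza-eating arm.
n = 2000
pizza_arm = rng.permutation(n) < n // 2

# Step 3: years later, compare disease rates between the arms.
# Invented 'true' rates: 3% in the pizza arm, 5% otherwise.
disease = rng.random(n) < np.where(pizza_arm, 0.03, 0.05)

print(disease[pizza_arm].mean())    # rate in the pizza arm
print(disease[~pizza_arm].mean())   # rate in the non-pizza arm
```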

Clearly, for both logistical and time reasons, a randomised controlled study is completely unrealistic for studying the effects of pizza on disease prevention. However, in terms of reliability of results, case-control studies are generally inferior to randomised controlled studies because of the potential for bias.

In case-control studies the selection of the control group is extremely important, and it might be very easy to fall into the trap of inadvertently selecting people with an unusually high rate of eating pizzas. (If, for example, you surveyed people while standing outside a pizzeria). It’s also easy – by accident or design – for the researcher to get the answer they might want when asking a question. For example: “you eat a lot of pizza, don’t you?” might get a different response from “would you describe yourself as a regular pizza eater?”. Moreover, people simply might not have an accurate recollection of their long-term eating habits. But most importantly, you are asking people with, for example, cancer of the colon whether they are regular pizza eaters. Quite plausibly this type of disease has a big effect on diet, and one can well imagine that pizzas are not advised by doctors. So although the pizza-eating question is probably intended to relate to the period prior to getting the disease, it’s possible that people with the disease are no longer tending to eat pizza, and respond accordingly.

Finally, even if biases are eliminated by careful execution of the study, there’s the possibility that the result is anyway misleading. It may be that although pizzas seem to give disease protection, it’s not the pizza itself that’s providing the protection, but something else that is associated with pizza eating. For example, regular pizza eating might just be an indicator of someone who simply has regular meals, which may be the genuine source of disease protection. There’s also the possibility that while the rates of pizza eating are lower among the individuals with the specified diseases, they are much higher among individuals with other diseases (heart problems, for example). This could have been identified in a randomised control study, but flies completely under the radar in a case-control study.

So, case-control studies are a bit of a minefield, with various potential sources of misleading results, and I would remain cautious about the life-saving effects of eating pizza.

And finally… like all statistical analysis, any conclusions made on the basis of sample results are only relevant to the wider population from which that sample was drawn. And since this study was based on Italians eating Italian pizzas, the authors conclude…

Extension of the apparently favorable effect of pizza on cancer risk in Italy to other types of diets and populations is therefore not warranted.

So, fill your boots at Domino’s Pizza, but don’t count on it doing much in the way of disease prevention.


Monkey business

Stick a monkey on a typewriter, let him hit keys all day, and what will you get? Gibberish, probably. But what if you’re prepared to wait longer than a day? Much longer than a day. Infinitely long, say. In that case, the monkey will produce the complete works of Shakespeare. And indeed any and every other work of literature that’s ever been written.

This is from Wikipedia:

The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type any given text, such as the complete works of William Shakespeare.

Infinity is a tricky but important concept in mathematics generally. We saw the appearance of infinity in a recent post, where we looked at the infinite sequence of numbers

1, 1/2, 1/4, 1/8, …

and asked what their sum would be. And it turned out to be 2. In practice, you can never really add infinitely many numbers, but you can add more and more terms in the sequence, and the more you add the closer you will get to 2. Moreover, you can get as close to 2 as you like by adding sufficiently many terms in the sequence. It’s in this sense that the sum of the infinite sequence is 2.
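You can watch that happening by computing the partial sums:

```python
# Partial sums of 1 + 1/2 + 1/4 + 1/8 + ...
# Each extra term halves the remaining gap to 2.
total = 0.0
for k in range(10):
    total += 1 / 2**k
    print(k + 1, total)   # 1.0, 1.5, 1.75, 1.875, ... -> 2
```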

In Statistics the concept of infinity and infinite sums is equally important, as we’ll discuss in a future post. But meantime… the infinite monkey theorem. What this basically says is that if something can happen in an experiment, and you repeat that experiment often enough, then eventually it will happen.

Sort of. There’s still a possibility that it won’t – the monkey could, by chance, just keep hitting the letter ‘a’ forever, for example – but that possibility has zero probability. That’s the ‘almost surely’ bit in the Wikipedia definition. On the other hand, with probability 1 – which, while not quite the same thing as logical certainty, is as sure as probability gets – the monkey will eventually produce the complete works of Shakespeare.

Let’s look at the calculations, which are very similar to those in another recent post.

There are roughly 50 keys on a keyboard, so assuming the monkey is just hitting keys at random, the probability that the first keystroke matches the first letter of Shakespeare’s works is 1/50. Similarly, the probability that the second letter matches is also 1/50. So the probability that the first two letters both match is

1/50 × 1/50 = (1/50)^2

Our monkey keeps hitting keys, and with each new keystroke the probability that the match-up continues is multiplied by another factor of 1/50. This probability gets small very, very quickly. But it never gets to zero.

Now, if the monkey has to hit N keys to have produced a text as long as the works of Shakespeare, by this argument he’ll get a perfect match with probability

p = (1/50)^N

This will be a phenomenally small number. Virtually zero. But, crucially, not zero. Because if our tireless monkey repeats that exercise a large number of times, let’s say M times, then the probability that he produces Shakespeare’s works at least once is

Q = 1 - (1-p)^M

And since p is bigger than zero – albeit only slightly bigger than zero – Q grows as M grows. And just as the sum of the numbers 1, 1/2, 1/4, … gets closer and closer to 2 as the number of terms increases, so Q can be made as close to 1 as we like by choosing M large enough.

Loosely speaking, when M is infinity, the probability is 1. And even more loosely: given an infinite amount of time our monkey is bound to produce the complete works of Shakespeare.
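If you want to try the calculation yourself, the only wrinkle is numerical: p = (1/50)^N underflows ordinary floating point for any serious text, so it’s safer to work through log1p/expm1. Here’s a sketch, using a short 6-character target so the answer is actually visible (and assuming 50 equally likely keys):

```python
import math

def prob_at_least_one_match(text_length, attempts, n_keys=50):
    """P(at least one perfect match in `attempts` tries).

    Uses log1p/expm1 so the tiny per-attempt probability
    p = (1/n_keys)**text_length isn't lost in 1-(1-p)**attempts.
    """
    p = n_keys ** (-text_length)
    return -math.expm1(attempts * math.log1p(-p))

# A 6-letter target: p = (1/50)**6, about 6.4e-11 per attempt
print(prob_at_least_one_match(6, 10**10))  # ~0.47
print(prob_at_least_one_match(6, 10**12))  # ~1.0: near certainty
```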


Obviously, both the monkey and the works of Shakespeare are just metaphors, and the idea has been expressed in many different forms in popular culture.  Here’s Eminem’s take on it, for example:


The China syndrome

In a couple of earlier posts I’ve mentioned how statistical analyses have sometimes been used to demonstrate that results in published analyses are ‘too good to be true’. One of these cases concerned Mendel’s laws of genetic inheritance. Though the laws have subsequently been shown to be unquestionably true, Mendel’s results on pea experiments were insufficiently random to be credible. The evidence strongly suggests that Mendel tweaked his results to fit the laws he believed to be true. He just didn’t understand enough about statistics to realise that the very laws he wanted to establish also implied sizeable random variation around predicted results, and the values he reported were much too close to the predicted values to be plausible.

As discussed in a recent academic article, a similar issue has been discovered in respect of official Chinese figures for organ donation. China has recently come under increasing international pressure to discontinue its practice of using organs of dead prisoners for transplants. One issue was consent – did prisoners consent to the use of their organs before their death? But a more serious issue concerned possible corruption, and even the suspicion that some prisoners were executed specifically to make their organs available.

Anyway, since 2010 China has made efforts to discontinue this practice, replacing it with a national system of voluntary organ donation. Moreover, they announced that from 2015 onwards only hospital-based voluntary organ donations would be used for transplants. And as evidence of the success of this program, two widely available datasets published respectively by the China Organ Transplant Response System (COTRS) and the Red Cross Society of China show rapid growth in the numbers of voluntary organ donations, which would more than compensate for the cessation of the practice of donations from prisoners.

Some of the yearly data counts from the COTRS database are shown in this figure, taken from the report referenced above. The actual data are shown by points (or triangles and crosses); the curves have been artificially added to show the general trend in the observed data. Clearly, for each of the count types, one can observe a rapid growth rate in the number of donations.

But… here’s the thing… look at how closely the smooth curves approximate the data values. The fit is almost perfect for each of the curves. And there’s a similar phenomenon for other data, including the Red Cross data. But when similar relationships are looked at for data from other countries, something different happens: the trend is generally upwards, as in this figure, but the data are much more variable around the trend curve.

In summary, it seems much more likely that the curves were chosen first, and the data then chosen to fit very closely to the curves. But just like Mendel’s pea data, this has been done without a proper awareness that nature is bound to lead to substantial variations around an underlying law. However, unlike Mendel, who presumably just invented numbers as a shortcut to establishing a law that was true, the suspicion remains that neither the data nor the law are valid in the case of the Chinese organ donation numbers.


A small technical point for those of you that might be interested in such things. The quadratic curves in the above plot were fitted in the report by the method of simple least squares, which aims to find the quadratic curve that minimises the overall distance between the points and the curve. As a point of principle, I’d argue this is not very sensible. When the counts are bigger, one would expect to see more variation, so we’d probably want to downweight the observations with large counts and give more weight to those with small counts. In other words, we’d expect the curve to fit better in the early years and worse in the later years, and we should take that into account when fitting the curve. In practice, the variations around the curves are so small that the results obtained by doing things this way are likely to be almost identical. So, it’s just a point of principle more than anything else. But still, in an academic paper which purports to use the best available statistics to discredit the claims made by a national government, it would probably be best to make sure you really are using the most appropriate statistical methods for the analysis.
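In Python the weighted version is a one-argument change. The counts below are invented, not the COTRS numbers; the point is just the contrast between equal weighting and a Poisson-motivated weighting in which the variance is proportional to the count:

```python
import numpy as np

# Invented yearly donation counts (not the COTRS data)
year = np.arange(2010, 2018)
count = np.array([34, 120, 340, 660, 1100, 1680, 2380, 3200])

# Ordinary least squares quadratic: every point weighted equally
ols = np.polyfit(year - 2010, count, deg=2)

# Weighted fit: if the variance grows with the count (Poisson-like),
# weight each point by 1/sqrt(count), downweighting the big counts
wls = np.polyfit(year - 2010, count, deg=2, w=1 / np.sqrt(count))

print(np.round(ols, 1))
print(np.round(wls, 1))  # similar here, but the principle differs
```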

Black Friday

Had you heard that today is Black Friday? Or have you been living as a hermit in a cave, without phone or email, for the last couple of weeks?

Like Cyber Monday, Green Monday and Giving Tuesday, Black Friday is a retail event imported from the United States, where it is timed to coincide with the Thanksgiving national holiday period. Sadly, here in the UK, we don’t get the holiday, but we do get the pleasure of a day – which often extends to at least a couple of weeks – indulging ourselves with the luxury of purchasing goods that we probably don’t need at prices that are well below the usual retail price.

Or do we?

The consumer group Which? monitored the prices of 83 products that were offered for sale during 2018’s Black Friday event and found:

  • 95% of the products were available at the same price or cheaper in the 6 months following Black Friday;
  • 61% of the products had been available at the same price or cheaper in the 6 months prior to Black Friday;
  • Just 5% of the products were genuinely at their cheapest on Black Friday compared to the surrounding 12-month period.

Obviously 83 products is not a huge sample size, especially since different retailers are likely to have different pricing strategies, so you shouldn’t read too much into the exact numbers. But the message is clear and probably ties in with your own experience of the way retailers manipulate shoppers’ expectations during ‘sales’.

Anyway, a fun statistical analysis of various aspects of Black Friday can be found here. I’m not sure how reliable any of the analyses are, especially in light of the Which results, but an example is given in the following figure. This shows – apparently – the sales growth per country on Black Friday compared to a regular Friday.

Now, I don’t know if it’s the number of items sold, the money spent, or something else, but in any case Pakistan supposedly has a retail rate on Black Friday that’s 11525% of a normal Friday’s rate. That’s to say, a sales increase factor of about 115. In Italy the factor is 45, and even in the UK the usual Friday rate is multiplied by 15. Impressive, if true.

But I’m personally more impressed by Thailand, which doggedly spends less than half of a normal Friday’s expenditure on Black Friday. Of course, we can’t tell from these data whether this is due to a genuine resistance to Black Friday, or whether Thailand has a strong seasonal variation in sales such that this time of year is naturally a period of low sales.


Finally, if you want to empathise with Thailand, you could yourself participate in Buy Nothing Day, intentionally held on the same day as Black Friday. It probably doesn’t need much in the way of explanation, but just in case, here’s the tagline from the webpage:

Lock up your wallets and purses, cut up your credit cards and dump the love of your life – shopping!

It’s time to celebrate Buy Nothing Day!

Maybe someone should pass the message on to Pakistan.

At The Intersection

You’ll remember Venn diagrams from school. They’re essentially a mathematical tool for laying out information in partially overlapping sets. And in statistics they are often used in the same way, for showing the possible outcomes of events which might overlap.

For example, here’s a Venn diagram showing the relationship between whales and fish:

Whales and fish have some properties that are unique, but they also have some features in common. These are all shown in the appropriate parts of the diagram, with the common elements falling in the part of the sets that overlap – the so-called intersection.
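In programming terms the intersection is just the overlap of two sets. Here’s a toy Python version of the whales-and-fish diagram, with the properties invented for illustration:

```python
# Invented properties for each set
whales = {"breathe air", "warm-blooded", "live in water", "have fins"}
fish = {"have gills", "cold-blooded", "live in water", "have fins"}

print(whales & fish)    # the intersection: properties they share
print(whales - fish)    # unique to whales
print(fish - whales)    # unique to fish
```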

With this in mind, I recently came across the following Venn poem titled ‘At the Intersection’ written by Brian Bilston:

You can probably work it out. There are three poems in total: separate ones for ‘him’ and ‘her’, and their intersection. Life seen from two different perspectives, the result of which is contained in the intersection.

Genius.


One-in-a-million

Suppose you can play on either of 2 slot machines:

  1. Slot machine A pays out with probability one in a million.
  2. Slot machine B pays out with probability one in 10.

Are you more likely to get a payout with one million attempts with slot machine A or with 10 attempts on slot machine B?

Have a think about this before scrolling down.

|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|

I was prompted to think about this question by the following tweet, which includes both the answer and the relevant calculations.

So, there’s a bigger probability (0.65) that you’ll get a payout from 10 spins of slot machine B than from a million spins of slot machine A (probability 0.63).

Hopefully, the calculations above are self-explanatory. But just in case, here’s the detail. Suppose you have N attempts to win with a slot machine that pays out with probability 1/N.

  1. First we’ll calculate the probability of zero payouts in the N spins.
  2. This means we get a zero payout on every spin.
  3. The probability of a zero payout on a single spin is one minus the probability of a win: 1 - 1/N.
  4. So the probability of no payout in all N spins is (1 - 1/N)^N.
  5. And the probability of at least one payout is 1 - (1 - 1/N)^N.

As explained in the tweet, with N=10 this gives 0.65 and with N=1,000,000 it gives 0.63. The tweet’s author explains in a follow-up tweet that he was expecting the same answer both ways.

But as someone in the discussion pointed out, that logic can’t be right. Suppose you had one attempt with slot machine C which paid out with probability 1. In other words, N=1 in my example above. Then, of course, you’d be bound to get a payout, so the probability of at least one payout is 1. So, although it’s initially perhaps surprising that you’re more likely to get a payout with 10 shots at slot machine B than with a million shots at slot machine A, the dependence on N becomes obvious when you look at the extreme case of slot machine C.
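Tabulating 1 - (1 - 1/N)^N for a few values of N shows the pattern, and also where the 0.63 comes from: as N grows, (1 - 1/N)^N approaches 1/e, so the probability of at least one payout settles down to 1 - 1/e ≈ 0.632:

```python
import math

# P(at least one payout) from N spins of a machine paying with prob 1/N
for n in [1, 2, 10, 100, 1_000_000]:
    q = 1 - (1 - 1 / n) ** n
    print(f"N = {n:>9}: {q:.4f}")

print(f"limit 1 - 1/e = {1 - 1 / math.e:.4f}")  # 0.6321
```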


Footnote: What does stay the same in each case, however, is the average number of times you will win. With N shots at a slot machine with win probability 1/N, you will win on average once, for any choice of N. Sometimes you’ll win more often, and sometimes you may not win at all (except when N=1). But the average number of wins, if you play many times, will always be 1.
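A quick simulation bears this out (a sketch, with an arbitrary choice of 100,000 repeated sessions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Play N spins on a machine with win probability 1/N, over many
# sessions, and average the number of wins per session.
for n in [10, 1_000_000]:
    wins = rng.binomial(n, 1 / n, size=100_000)
    print(n, wins.mean())   # close to 1 in both cases
```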