Football’s back


It’s coming home… Well, coming back technically. But either way the Premier League has now agreed to return to some sort of normal service on 17th June. Some leagues have already started – South Korea and Germany – with others set to follow shortly. For the time being at least, it’s starting to feel like business as usual.

So now feels like a good time to pause the blog again. Unless something happens to change my mind, this will be my final post on the Coronavirus epidemic. Thankfully, in many countries, the epidemic has been brought under partial control through social-distancing and other measures, so the debate has moved away from the science of epidemiology, in which Statistics is central, to socio-behavioural issues, in which Statistics is relevant, but less fundamental. Moreover, debate in the UK especially is increasingly focused on political aspects, and I don’t think it’s appropriate to use this forum to contribute to those discussions.

So, I’ll stop here. Thanks to those of you who have been following these posts, and especially to anyone who has commented either directly to the blog or indirectly to me personally.

By way of concluding things, I’ll just mention an article in today’s Financial Times, since it pulls together many of the themes I’ve tried to cover in these posts, while also summarising the current state of the epidemic in the UK.

  1. There’s been a lot of discussion about the fairness of comparing death rates across countries, which measure should be used, and whether numbers should be stated per capita. On balance, it’s believed that showing excess deaths per capita is fairest. This graph includes such a comparison: the UK fares worst by this measure, and is outdone only by the US and Peru on alternative measures.
  2. Italy – from where I am writing – also does poorly by this same measure. However, Italy was hit much sooner than the UK. The UK had an advantage of several weeks of both time and evidence to help avert the crisis.
  3. One of the factors for the UK’s troubled epidemic trajectory is the speed with which it ordered a nationwide lockdown. This graph shows a very strong relationship – though not necessarily causal – between the speed of lockdown and the number of excess deaths. As an aside, there’s also confirmation today that the failure to stop major sporting events in the period immediately before the lockdown, in particular the Liverpool-Atletico return leg in the Champions League, is likely to have contributed to subsequent Coronavirus hotspots.
  4. Another reason the UK has been so badly affected compared to other countries is the fact that cases were spread very early on in the epidemic across most of the country. In Italy most cases were, and indeed still are, concentrated in Lombardia and surrounding regions. In Spain it was Madrid and, to a lesser extent, Barcelona. In France, Paris and another region to the east close to the Swiss-German border. Having localised outbreaks made containment measures much easier and more effective compared to the situation in the UK where most regions were affected. The following graph shows the number of excess deaths through time in each of the UK regions. There are some differences, but excluding Northern Ireland, no region ended up having more than twice as many excess deaths as any other.

  5. I’ve referred several times in the blog to the statistician Professor David Spiegelhalter, who for a period was widely quoted by the government for apparently suggesting that cross-country comparisons are inappropriate. But that was a wilful misquote of what he actually said, and a more accurate summary of his feelings on the subject is included in the FT article:

If we can believe the data from other countries, then the UK has done badly in terms of excess deaths. The issues now concern what will happen for the rest of the year, and trying to understand the processes contributing to our large excess.

Finally, though I’m planning not to write any more on this topic, there are plenty of places to keep up-to-date with statistical matters relating to the epidemic, many of which I’ve referenced in earlier posts. In particular, there’s nobody better at explaining tricky statistical ideas in simple terms than David Spiegelhalter, whose blog is available here.

What a difference a week makes

As of today there have been more than 90,000 confirmed deaths due to Coronavirus in the USA, despite  But how important was the timing of the introduction of nationwide social distancing?

To examine this, Youyang Gu, who is a data scientist from New York, ran an epidemiological model – which incidentally is far more accurate than the US government’s own model – for the epidemic in the US, but under different assumptions about the timings of the introduction of social distancing.

These graphs summarise what happens if restrictions had been introduced a week earlier than actually happened:

And this is what happens if restrictions had been delayed by an additional week, compared to what actually occurred:

So if restrictions had been introduced a week earlier, there would have been around 35,000 deaths as of today – some 55,000 fewer than have actually occurred. But delaying things an extra week would have meant around 250,000 deaths by now, an increase of 160,000 on the actual value.

Differences in the projections for the numbers of fatalities by August are even greater: in the two scenarios above, the prediction is for around 60,000 and 455,000 deaths respectively. These compare with a forecast of around 135,000 based on the true timings of restrictions.

There’s a well-known saying that a week is a long time in politics. This was never more true than in the midst of the current pandemic.

It’s fair to add some comments on the comparisons above:

  1. As stressed, these comparisons are inevitably based on model forecasts, not actual numbers, though the model used has proved accurate in tracking the trajectory of the epidemic so far.
  2. The comparison is based on US numbers, though the principle of the importance of the timing of response to the epidemic is equally valid elsewhere.
  3. The models assume that the restrictions that have been introduced will be maintained in the future. If social distancing is relaxed, it’s likely that the numbers will grow at a faster rate than predicted here.
  4. Restrictions by themselves can contain an epidemic, but they cannot make it go away. And, since contagion rates are reduced once more people are infected,  the more successful restrictions are in containing an epidemic, the more vulnerable the population is to further outbreaks once those restrictions are removed.
  5. It’s also well-understood that there are costs, both in terms of economics and non-Coronavirus fatalities, to maintaining strong social-distancing measures. A fair comparison should really include these additional costs.


R: should you care?

There was a joke in my post last week where Binky the amateur epidemiologist was giving a lesson on the meaning of R. The joke is: we’re all semi-experts now in R. We know it’s the average number of people an infected person will go on to infect. And we know that it’s important that it stays below 1: bigger than 1 and the epidemic will grow exponentially; smaller and it will fade away.
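To see why the value 1 is the tipping point, here is a rough sketch in Python. It assumes a toy model in which infections happen in discrete generations and every infected person passes the virus on to exactly R others; the function name and the starting figure of 1000 cases are made up for illustration.

```python
def cases_by_generation(initial_cases, r, generations):
    """Expected new cases in each generation under a constant R (toy model)."""
    cases = [float(initial_cases)]
    for _ in range(generations):
        # Each case in one generation produces r cases in the next.
        cases.append(cases[-1] * r)
    return cases


# Hypothetical starting point of 1000 cases, followed for 5 generations.
growing = cases_by_generation(1000, 1.5, 5)    # R above 1
shrinking = cases_by_generation(1000, 0.5, 5)  # R below 1

print(growing)
print(shrinking)
```

With R above 1 each generation is bigger than the last; with R below 1 each generation is smaller, so the outbreak eventually fades.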

So, it’s a bit disconcerting that in Friday’s press briefing it was revealed that the current estimate of R in the UK is dangerously close to the value of 1. And this is based on data on infections that will have occurred before there was a loosening of the lockdown restrictions. Should we worry?

Not according to right-wing radio talk show host Julia Hartley-Brewer:

The article by Tom Chivers that Hartley-Brewer quotes is actually pretty interesting, and connects to a phenomenon in Statistics that was discussed in a very early post to this blog in pre-Coronavirus days. I’ll use the numerical example that Tom gives to illustrate things. It’s obviously a simplification of the real world, but it makes the point very effectively.

A particular issue with the Coronavirus epidemic around the world has been its devastation in care homes. Partly this is because it tends to hit older people hardest, and partly it’s because the nature of care homes makes contagion much harder to control. As such, the transmission rate is likely to be higher in care homes compared to the rest of the population.

So, suppose we have 1000 infected people in the wider population and 1000 infected people in care homes. Suppose also that the value of R is 2 in the population, but 3 in care homes. Then, on average, these groups of infected people will infect a further 2000 and 3000 people respectively. So, in total, we have 2000 infected individuals who will infect a further 5000 people and the overall value of R is 5000/2000 = 2.5.

Since this value is dangerously high, lockdown restrictions are introduced, both in care homes and outside. Let’s assume these have the effect of reducing the transmission rate in care homes to R=2.8, while the impact in the wider population is much greater, reducing R to 1.

Some time later it’s found there are 900 infected individuals in care homes and 100 outside. Because of the respective values of R, these individuals will then, on average, infect a further 900 x 2.8 = 2520 individuals in care homes, and 100 x 1 = 100 individuals outside. So, overall, we have 1000 infected individuals who will infect an average of 2620 further individuals and the overall value of R is 2.62.

And here’s the remarkable thing: the value of R has decreased both inside care homes and outside, but the overall value of R has increased.

This is an example of Simpson’s paradox which, as explained above, was discussed in a sporting context much earlier in this blog. The point is this: although R has gone down in both the separate communities, its value remains much higher in one compared to the other. And because of the relative numbers of infected individuals, the overall calculation of R is dominated by the care home numbers after the lockdown. Previously it was balanced between care home and general population numbers. The effect is for the overall value of R to move closer to the care home value after the lockdown, which is lower than the value before the lockdown, but higher than the original overall value.
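The arithmetic in the example above can be checked with a few lines of Python. This is only a sketch of the two-group illustration: the function name is hypothetical, and the overall R is simply the average of the group values weighted by the number of infected people in each group.

```python
def overall_r(groups):
    """Overall R for a list of (infected, r) pairs, one pair per group.

    It is the total number of expected new infections divided by the
    total number of currently infected people.
    """
    total_infected = sum(infected for infected, _ in groups)
    total_new = sum(infected * r for infected, r in groups)
    return total_new / total_infected


# Each pair is (number infected, R): wider population first, care homes second.
before = overall_r([(1000, 2.0), (1000, 3.0)])
after = overall_r([(100, 1.0), (900, 2.8)])

print(before, after)
```

R falls in both groups, yet the overall value rises, because after the lockdown nine out of ten infected people are in the group with the higher R.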

So what does this mean in practice? Julia Hartley-Brewer’s interpretation is that although the evidence is that R has increased in the UK population, this might well be a consequence of Simpson’s paradox as above. It’s not, according to her, that social restrictions are ineffective; it’s that they are so effective outside of care homes that calculations of R are now dominated by the behaviour of transmission in care homes, which forces the value to be close to 1. And she boldly concludes:

… we don’t need to return to full lockdown.

But this misses the point completely. Although the overall value of R is less than 1, and its rise may well be due to the effect of Simpson’s paradox along the lines of the numerical example above, this very argument means that it’s likely that the value of R in care homes remains considerably greater than 1. This is extremely dangerous for 2 reasons. First, within care homes, transmission rates remain at levels that imply exponential growth. Left unchecked, this would be devastating for care home residents. Second, it’s impossible in practice to completely isolate care homes from the rest of the population. So, even though R is likely to be less than 1 in the wider community, its contact with another community for which R is greater than 1 is likely to stop the epidemic from simply dying out as would inevitably happen in a closed community with R less than 1.

The conclusion, therefore, is completely the opposite of what Hartley-Brewer implies: the fact that a vulnerable subset of the population is likely to have a value of R greater than 1 adds weight to the arguments for being cautious about weakening lockdown restrictions. Not just for people in care homes, but also for right-wing talk show hosts living on the outside.

In the real world, of course, things are much more complicated than just two sub-populations with different transmission rates. Transmission rates are likely to vary geographically and by many other socio-demographic factors. The models on which policies are being developed allow for these multiple types of behaviour, and are therefore not ‘tricked’ by Simpson’s paradox. Discussions about the value of R are therefore unhelpfully simplistic. It might be the single best measure of the state of an epidemic’s trajectory, but in itself it’s not really sufficient to determine whether the epidemic is under control or not.

The politicisation of Statistics


My intention with these posts about Coronavirus has always been to show how Statistics can be used as part of a battery of scientific tools to learn about, understand and even fight the epidemic. I’m conscious though that a number of recent posts – for example, here, here, here, here, and here – have focused on the interplay between politics and Statistics in the UK response to the Coronavirus epidemic. This was unintentional, rather than planned.

As I’ve mentioned before, and many of you will have known already, I live in Italy, which was affected sooner than the UK by the current epidemic. I’ve therefore followed both the science and the government response to the crisis quite closely both here (Italy) and in the UK. There are many similarities, but also quite a few differences, both in the trajectory of the epidemic and in the way the governments have handled things. Without question, Italy has made many mistakes, though it also had less evidence and less time to make decisions. But as a statistician, what strikes me about the UK response is the extent to which Statistics has been used – and misused – as a cover for government action and inaction. If you read the posts linked above, you should get a sense of what I mean, though I also abandoned many other potential posts at the draft stage because I wanted to avoid this blog simply becoming a rant.

I can’t leave this issue without mentioning the latest abuse of Statistics by the UK government however. As you’ll know – discussed here – as part of a daily press briefing, the government included a slide comparing the trajectory of the virus in different countries. In my previous post, I already discussed the fact that some cosmetic changes had been introduced to that particular slide which had the effect of making the UK’s numbers seem less extreme compared to those of other European countries. But since then, the UK numbers have pretty much remained stable, while those of other countries have started to improve, meaning that the UK looks increasingly worse than other European countries. Consequently, as of this week, the UK government has dropped this particular slide from the daily briefings.

Now, you can make a perfectly valid argument – as indeed Professor David Spiegelhalter did – about the utility of detailed cross-country comparisons. And on the basis of that argument, you might reasonably decide that showing a graph that compares country numbers is misleading and choose not to do it. But what you can’t do, unless you are deliberately manipulating Statistics to best suit your purposes, is include the graph when it shows your country in a favourable light, but then stop showing it as soon as it doesn’t. That is a terrible use of Statistics, and arguably pretty poor government as well.

End. Of. Rant.

It’s not just me though:

Regional variations



A complication when looking at country-by-country numbers of Coronavirus cases or deaths is the effect of regional variations. Although China was badly affected at the start of the pandemic, it managed to pretty much contain the virus to the single region of Hubei, with the rest of China mostly unaffected. This clearly had little to do with the size of the different regions or other socio-geographic reasons, but was due to the travel and other restrictions that were imposed to limit the spread of the virus.

Similarly in Italy. The outbreak started in Lombardia, and spread to neighbouring regions before controls were put in place. The following graph compares the number of cases through time – up till 13th May – in the 5 worst affected regions of Italy, which happen to include Lombardia and its neighbours. As you can see, even among these 5 regions, there are very big differences. Lombardia has had roughly the same number of cases as the other 4 regions combined, and at 394 continues to have the highest number of new cases (though Piemonte, which includes Torino, has a slightly higher rate of increase). Comparisons against the other 15 regions of Italy would show even bigger differences of course.

In one sense, having the majority of cases in a single location is a very big disadvantage: the pull on health resources in that location at the height of the epidemic will be enormous. And it’s pictures and stories from hospitals in Lombardia which were a wake-up call for many people about the dangers of this epidemic.

In another sense though, there’s a big advantage in localising the epidemic. If all cases are in one place, and you can stop transmission to other places, then the number of potential victims is much reduced. This is most evident from the China experience. China has a population of around 1.4 billion. But by effectively cutting off transmission from Hubei, with a population of around 60 million, to the rest of China, the risk to the majority of the country was almost eliminated. Had the same number of infected people been spread around China rather than concentrated in Hubei, the whole population would have been at much higher risk.

With all this in mind, I’m grateful to for sending me this link which allows a comparison of various aspects of UK Coronavirus data by either region or age. Here, for example, is a screenshot of the trajectory of the total number of cases in London and 3 other regions. The vertical blue line corresponds to the date at which the UK was placed under lockdown; the vertical red line is the date at which the graphs were produced (10th May).  The article itself includes other regions and the option to switch to other types of data – daily cases, death counts, etc.

There are some aspects of these figures that I don’t fully understand, which I’ll discuss below. But focus on the black curve, which is a smoothed value of the total number of cases on each date in each region. It’s a little higher in London and, to a lesser extent, the Midlands, but is of the same order of magnitude in each of the regions. Also, although the curve seems to have stabilised in those 2 regions, it seems to still be growing in the North East and North West, so it’s likely they’ll all end up with similar numbers. But the point is, there is no evidence of any local containment. Rather the virus seems widely spread at similar levels across the whole country.

As discussed above, this is good news in terms of health resources: demand is evenly spread across the whole country, avoiding the type of pressures that occurred in Lombardia. But the downside is that containment is likely to be much more difficult. Travel between regions in Italy is currently much more highly restricted than travel within regions, and you can see why this is an effective policy if the vast majority of cases are in Lombardia. In the UK, where the cases are much more widely dispersed, it’s clear that a strategy of this type would be of very little benefit.

On a rather more positive note, the above link also provides an estimate of the infection transmission rate – the ubiquitous R – through time for each region. Clearly, the lockdown has been effective in reducing R below the critical value of 1 in each of the regions.

The big question, of course, is how these values will change as the lockdown restrictions are gradually loosened. And a concern for the UK, because of the regional uniformity of numbers, is that changes similar to those made in other countries will have a bigger impact on the value of R.

The report linked to above that contains the UK regional analysis was produced by the COVID-19 Working Group, which is part of the MRC Biostatistics Unit, which in turn is part of Cambridge University. It’s therefore likely to be high quality and reliable.

That said, there are some aspects of the graphs above that I find a bit puzzling. In both sets of graphs the shaded blue regions are so-called 95% credibility intervals. You can interpret these as meaning that the true value, of which the black curve is a kind of ‘best estimate’, will lie within the blue shaded region with probability 95%. Even more loosely: the black curve is the best estimate, but allowing for variations in sample data, the truth can only really be assumed to be anywhere within the blue shaded region.

What I don’t really understand with the first set of figures is why there is just as much uncertainty – as measured by the size of the blue shaded region – in the period going forward from the red line, as there was in the past where we have pretty much complete data. Usually, forecasts have much bigger credibility intervals because of the manifest unknown in how the future will behave.

That’s also an issue with the second set of figures on R, but there I have a more fundamental question. Is it really the case that you can identify from the data such a sudden change in R from above 3 to below 1 on exactly the day the lockdown was introduced? It will be a smart piece of Statistics if that’s really the case, but I’m doubtful. Empirical counts of new cases changed only very gradually around the date of the lockdown, so I find it surprising that a statistical model could pinpoint a big change so accurately.

The report includes mention of a ‘Methods Section’, but that seems to be missing, so I haven’t been able to investigate these aspects further.


Alert Level

This is a screenshot from the UK PM’s statement to the nation regarding a roadmap towards ending the current Coronavirus lockdown.

Taken literally the equation is clearly nonsense. As you’ll know, the value of R is somewhere around 1: values slightly smaller than 1 imply the epidemic is decaying, values greater than 1 imply it is growing exponentially. Even the most pessimistic estimates of R before the lockdown were around 5. On the other hand, the number of current infections is in the several thousands, with large fluctuations from day to day. So, the inclusion of R in this equation is virtually irrelevant, and the Alert Level would oscillate wildly from day to day with the number of infected individuals.

Let’s assume instead that it’s intended that the number of infections is scaled – say by 1000. So, if R is 1 and the number of infections is 2,500, then the Alert Level would be 3.5. But still it doesn’t make much sense. Suppose you managed to eradicate transmission, so that R=0, but you still had 3000 infected in the population. Then the Alert Level would be 3, even though there would be no risk of further infection. Moreover, would an increase of 1 in the value of R be equally serious as an increase of 1000 in the number of infected individuals, as the equation implies? Generally, that would depend on the actual number of infected individuals: having 20,000 rather than 19,000 infected probably won’t alter the course of the epidemic very much, but having R=1.5 rather than R=0.5 most definitely would.
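Those numbers are easy to reproduce. The sketch below reads the equation on the slide as a simple sum, which is how the worked figures above treat it; the scaling by 1000 is the assumption made in this post, not something stated on the slide, and the function name is made up.

```python
def alert_level(r, infections, scale=1000):
    """Literal reading of the slide: Alert Level = R + infections / scale."""
    return r + infections / scale


print(alert_level(1, 2500))   # R = 1 with 2,500 infections
print(alert_level(0, 3000))   # no transmission at all, but 3,000 infections

# Under this reading, raising R by 1 and adding 1000 infections count the same.
jump_in_r = alert_level(1.5, 19000) - alert_level(0.5, 19000)
jump_in_infections = alert_level(0.5, 20000) - alert_level(0.5, 19000)
print(jump_in_r, jump_in_infections)
```

The last two numbers are identical, which is exactly the problem: the sum treats a move from R=0.5 to R=1.5 as no more serious than a move from 19,000 to 20,000 infected.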

So, any literal interpretation of the slide, even allowing for scaling effects, is completely false. What is presumably intended is that decisions taken on determining an Alert Level will be driven by two factors: the current estimated rate of transmission and the current number of infected individuals. By far the most important of these is the rate of transmission, since the nature of exponential growth is that just a few cases will become many thousands in a short period of time if R is bigger than 1. But the number of cases is relevant. Partly because it affects the number of new infected, especially in the short term; but more so because, if the number is sufficiently low, then a policy of containment through testing and contact tracing is feasible.

In summary: if you disregard any literal interpretation of the equation, but regard it as saying that two primary factors need to be considered when determining an appropriate Alert Level for COVID, then it makes some sort of sense. But presenting complex arguments in a way that makes them seem simpler is both patronising and counter-productive.

Polite request to PM

Of course we should now use other countries to try and learn why our numbers are high

  • David Spiegelhalter

Update: David Spiegelhalter also gave an interview yesterday on the Andrew Marr show:

It includes the line that the daily press briefings, which he describes as ‘completely embarrassing’, are:

… not a trustworthy communication of statistics.

The best and the worst of Statistics

The above graph is included in the following tweet sent by the CEA, which is the Council of Economic Advisers to the US White House.

The fluctuating black line shows the number of deaths due to Coronavirus per day in the US. The coloured dotted lines are model estimates and predictions produced at different time points. I’m not sure I need to make the points, but:

  1. The CEA claim that the mortality curves have “matched the data fairly well” is open to question.
  2. Accepting that a model fits well over a period of observed data is no real basis for assuming the model can be extrapolated into the future. The various model predictions here imply there will be zero new deaths in the US from a range of dates between 16th May and 4th August. All serious epidemiological models for the same process would describe such possibilities as somewhere between impossible and negligible.

To be fair to the authors of the IHME model – whose details are available here –  the detailed projections shown here as of 1st May do include measures of uncertainty as per the following graph:

Nonetheless, given that the epidemic is currently at best on a plateau, and at worst on an upward trajectory, it seems unduly optimistic to expect a decline that’s almost as fast as the exponential growth in the early phase of the epidemic.

But let’s give this model, which is at least based on epidemiological assumptions, the benefit of any doubt. The CEA graph also includes a so-called “cubic fit” which is the one that leads to an estimate of zero deaths as of 16th May. There are no details as to how this has been obtained, but presumably someone has simply carried out a regression on the data of the black curve using a smooth curve (technically a polynomial of degree 3). But such a curve is bound to go negative at some point. Now if you look carefully at the cubic fit in the CEA graph, there’s a point where the curve changes from dashes to dots. My guess is that someone simply altered the cubic fit so as to stop it going negative. Unfortunately, zero deaths on 16th May is almost as improbable as negative deaths, so they might as well have not bothered.
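Since there are no details of how the CEA curve was produced, the following is only a sketch of the general problem, not a reconstruction of their analysis. It uses made-up daily death counts (a rise to a peak followed by the start of a decline, with a small wobble added), fits a cubic by ordinary least squares, and then extrapolates. All names and numbers are hypothetical.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ..., c_degree] of c0 + c1*x + c2*x**2 + ...
    """
    n = degree + 1
    # Augmented matrix of the normal equations: [A^T A | A^T y].
    rows = []
    for i in range(n):
        row = [sum(x ** (i + j) for x in xs) for j in range(n)]
        row.append(sum(y * x ** i for x, y in zip(xs, ys)))
        rows.append(row)
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(rows[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rows[i][n] - tail) / rows[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Value of the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Made-up data, NOT the US figures: 46 days of counts that peak around day 40,
# with an alternating 5% wobble so the fit is not exact.
days = list(range(46))
deaths = [t * t * (60 - t) / 16 * (1.05 if t % 2 == 0 else 0.95) for t in days]

coeffs = fit_polynomial(days, deaths, 3)

# Extrapolate beyond the data and find the first day with a negative prediction.
first_negative_day = next(t for t in range(46, 200) if evaluate(coeffs, t) < 0)
print(first_negative_day, evaluate(coeffs, 70))
```

The fitted curve looks reasonable over the observed days, but because it has turned downwards by the end of the data, the extrapolation keeps falling and soon predicts a negative number of deaths.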

Anyway, this epidemic has brought out the best and the worst of Statistics. I guess you can work out where the CEA analysis falls on this range.


  1. Thanks to for showing me the CEA tweet.
  2. This cartoon seems relevant right now:

A cover up

As the focus moves from handling the initial wave of the epidemic to what life after a lockdown should look like, at least in the short term, one important issue is whether the wearing of face masks is effective or not.

The following graph has been around on social media for a while now:

I’m pretty sure this was never intended as a serious statistical analysis, but there is, admittedly, a perfect classification in the epidemic trajectory between countries that have and haven’t required the use of face masks in public. But there are many other factors that may explain the differences – not least geography – and it’s obviously wrong to assume that the wearing of masks is the principal cause.

Still, it’s noticeable that in countries like South Korea, that managed to control the outbreak at an early stage, the use of face masks in public is considered almost mandatory. This point is explicitly made in a Royal Society report written to inform the UK Government’s scientific committee set up to handle the coronavirus epidemic. Their summary reads:

This evidence supports the conclusion that more widespread risk-based face mask adoption can help to control the Covid-19 epidemic by reducing the shedding of droplets into the environment from asymptomatic individuals. This is also consistent with the experiences of countries that have adopted this strategy.

In other words, the scientific evidence points to masks being helpful in reducing the spread of the virus from infected to uninfected individuals. And in addition to the scientific evidence, there is circumstantial evidence that countries that have widely adopted the use of face masks have been successful in containing the spread of the virus.

The report itself is essentially a review of the scientific literature – much of which is very recent – into the effectiveness of wearing face masks as a protection against infection. One of the articles included, published by the British Medical Journal, is especially interesting. It makes these points:

In the face of a pandemic the search for perfect evidence may be the enemy of good policy. As with parachutes for jumping out of aeroplanes, it is time to act without waiting for randomised controlled trial evidence.

That’s to say, in an ideal world, if you want to statistically test the efficacy of something, you would carry out a randomised trial in which you compare results on one group which is given the treatment, and a control group which is not. Essentially the A/B trial approach discussed here. But in the midst of an epidemic – just like jumping from an aeroplane – there’s no time to worry about whether parachutes are effective or not. The parachute almost certainly won’t do you any harm, so you may as well release it.

On this basis they conclude:

  • The precautionary principle states we should sometimes act without definitive evidence, just in case

  • Whether masks will reduce transmission of covid-19 in the general public is contested

  • Even limited protection could prevent some transmission of covid-19 and save lives

  • Because covid-19 is such a serious threat, wearing masks in public should be advised

Other literature covered in the Royal Society report – for example, this article – provides more explicit evidence from statistical studies that the wearing of masks is effective in reducing virus transmission. Again, not in terms of preventing the wearer from becoming infected, but in reducing the risk of the mask-wearer, who may be an asymptomatic carrier of the virus, passing it on. That’s to say, the evidence points to mask-wearing as making a similar contribution to epidemic control as social distancing: it doesn’t eliminate anyone’s risk of catching the virus, but plays a part in reducing the overall transmission rate, and is therefore an important contributor to disease control.