# R: should you care?

There was a joke in my post last week where Binky the amateur epidemiologist was giving a lesson on the meaning of R. The joke is: we’re all semi-experts now in R. We know it’s the average number of people an infected person will go on to infect. And we know that it’s important that it stays below 1: bigger than 1 and the epidemic will grow exponentially; smaller and it will fade away.

So, it’s a bit disconcerting that in Friday’s press briefing it was revealed that the current estimate of R in the UK is dangerously close to the value of 1. And this is based on data on infections that will have occurred before there was a loosening of the lockdown restrictions. Should we worry?

Not according to right-wing radio talk show host Julia Hartley-Brewer:

The article by Tom Chivers that Hartley-Brewer quotes is actually pretty interesting, and connects to a phenomenon in Statistics that was discussed in a very early post to this blog in pre-Coronavirus days. I’ll use the numerical example that Tom gives to illustrate things. It’s obviously a simplification of the real world, but it makes the point very effectively.

A particular issue with the Coronavirus epidemic around the world has been its devastation in care homes. Partly this is because it tends to hit older people hardest, and partly it’s because the nature of care homes makes contagion much harder to control. As such, the transmission rate is likely to be higher in care homes compared to the rest of the population.

So, suppose we have 1000 infected people in the wider population and 1000 infected people in care homes. Suppose also that the value of R is 2 in the population, but 3 in care homes. Then, on average, these groups of infected people will infect a further 2000 and 3000 people respectively. So, in total, we have 2000 infected individuals who will infect a further 5000 people and the overall value of R is 5000/2000 = 2.5.

Since this value is dangerously high, lockdown restrictions are introduced, both in care homes and outside. Let’s assume these have the effect of reducing the transmission rate in care homes to R=2.8, while the impact in the wider population is much greater, reducing R to 1.

Some time later it’s found there are 900 infected individuals in care homes and 100 outside. Because of the respective values of R, these individuals will then, on average, infect a further 900 x 2.8 = 2520 individuals in care homes, and 100 x 1 = 100 individuals outside. So, overall, we have 1000 infected individuals who will infect an average of 2620 further individuals and the overall value of R is 2.62.

And here’s the remarkable thing: the value of R has decreased both inside care homes and outside, but the overall value of R has increased.
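For anyone who wants to check the arithmetic, here’s a minimal sketch in Python. It assumes, as in the example, that the overall R is just the total number of new infections divided by the total number of currently infected people. The function name `overall_r` and the variable names are made up for this illustration.

```python
def overall_r(groups):
    """Overall R for a list of (infected, r) pairs.

    Assumes overall R = total new infections / total currently infected,
    as in the simplified two-group example.
    """
    total_infected = sum(n for n, _ in groups)
    total_new = sum(n * r for n, r in groups)
    return total_new / total_infected


# (number infected, R) for the wider population and for care homes
before = [(1000, 2.0), (1000, 3.0)]
after = [(100, 1.0), (900, 2.8)]

r_before = overall_r(before)
r_after = overall_r(after)

print(f"overall R before lockdown: {r_before:.2f}")
print(f"overall R after lockdown:  {r_after:.2f}")
```

Both group-level values of R go down between `before` and `after`, yet the overall figure goes from 2.5 to 2.62, because the split of infected people has shifted from 50/50 to 90/10 in favour of care homes.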

This is an example of Simpson’s paradox which, as explained above, was discussed in a sporting context much earlier in this blog. The point is this: although R has gone down in both the separate communities, its value remains much higher in one compared to the other. And because of the relative numbers of infected individuals, the overall calculation of R is dominated by the care home numbers after the lockdown. Previously it was balanced between care home and general population numbers. The effect is for the overall value of R to move closer to the care home value after the lockdown, which is lower than the value before the lockdown, but higher than the original overall value.
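Written as a formula, the overall R in this simplified two-group setting is a weighted average of the two group values. The symbols are just shorthand introduced for this illustration: $n_c$ and $n_w$ are the numbers of infected people in care homes and in the wider population, $R_c$ and $R_w$ are the corresponding transmission rates, and $w_c$ is the share of infected people who are in care homes.

```latex
R_{\text{overall}}
  = \frac{n_c R_c + n_w R_w}{n_c + n_w}
  = w_c R_c + (1 - w_c) R_w,
\qquad
w_c = \frac{n_c}{n_c + n_w}.

% Before lockdown: w_c = 0.5
R_{\text{overall}} = 0.5 \times 3 + 0.5 \times 2 = 2.5

% After lockdown: w_c = 0.9
R_{\text{overall}} = 0.9 \times 2.8 + 0.1 \times 1 = 2.62
```

So the paradox comes entirely from the weight $w_c$ moving from 0.5 to 0.9: both $R_c$ and $R_w$ fall, but the average now leans almost completely on the larger of the two.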

So what does this mean in practice? Julia Hartley-Brewer’s interpretation is that although the evidence is that R has increased in the UK population, this might well be a consequence of Simpson’s paradox as above. It’s not, according to her, that social restrictions are ineffective; it’s that they are so effective outside of care homes that calculations of R are now dominated by the behaviour of transmission in care homes, which forces the value to be close to 1. And she boldly concludes:

But this misses the point completely. Although the overall value of R is less than 1, and its rise may well be due to the effect of Simpson’s paradox along the lines of the numerical example above, this very argument means that it’s likely that the value of R in care homes remains considerably greater than 1. This is extremely dangerous for two reasons. First, within care homes, transmission rates remain at levels that imply exponential growth. Left unchecked, this would be devastating for care home residents. Second, it’s impossible in practice to completely isolate care homes from the rest of the population. So, even though R is likely to be less than 1 in the wider community, its contact with another community for which R is greater than 1 is likely to stop the epidemic from simply dying out, as would inevitably happen in a closed community with R less than 1.

The conclusion, therefore, is completely the opposite of what Hartley-Brewer implies: the fact that a vulnerable subset of the population is likely to have a value of R greater than 1 adds weight to the arguments for being cautious about weakening lockdown restrictions. Not just for people in care homes, but also for right-wing talk show hosts living on the outside.

In the real world, of course, things are much more complicated than just two sub-populations with different transmission rates. Transmission rates are likely to vary geographically and by many other socio-demographic factors. The models on which policies are being developed allow for these multiple types of behaviour, and are therefore not ‘tricked’ by Simpson’s paradox. Discussions about the value of R are therefore unhelpfully simplistic. It might be the single best measure of the state of an epidemic’s trajectory, but in itself it’s not really sufficient to determine whether the epidemic is under control or not.