Seasonal effects

There’s been plenty of speculation (here, here, here,…) that the novel Coronavirus might be seasonal, meaning that transmission rates will reduce significantly in the warmer summer months in temperate countries. This would help significantly in controlling the current epidemic wave, potentially buying considerable time for vaccine development or other exit strategies from current lockdown conditions. But so far there’s been little direct evidence that the Coronavirus is genuinely seasonal.

However, the following tweet links to a statistical analysis which, though circumstantial, provides reason to believe in a seasonal effect. The author of the study looked at per-capita death rates due to COVID-19 in individual counties of the United States. They then fitted a regression model using demographic and climate-based statistics as potential explanatory variables for differences in county-to-county rates. What emerged is that temperature is the most significant factor. That’s to say, after allowance for other explanatory factors, the one that had the most impact was temperature: in counties with higher average temperature, everything else being equal, the per-capita death rate due to COVID-19 was lower.
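To make the idea of "allowing for other explanatory factors" a little more concrete, here is a minimal sketch of a regression of this general type. Everything in it is an assumption for illustration: the data are synthetic, the variable names (temperature, median_age, log_density) are hypothetical, and plain least squares is used, which may well differ from the model fitted in the study.

```python
import numpy as np

# Synthetic, made-up county data: NOT the data used in the study.
rng = np.random.default_rng(0)
n_counties = 200
temperature = rng.normal(15.0, 5.0, n_counties)   # average temperature (deg C)
median_age = rng.normal(40.0, 4.0, n_counties)    # hypothetical demographic factor
log_density = rng.normal(4.0, 1.0, n_counties)    # hypothetical log population density

# Assumed "true" temperature effect used to generate the fake response.
TRUE_TEMP_EFFECT = -0.05
log_death_rate = (
    1.0
    + TRUE_TEMP_EFFECT * temperature
    + 0.03 * median_age
    + 0.20 * log_density
    + rng.normal(0.0, 0.1, n_counties)            # unexplained county-level noise
)

# Design matrix with an intercept column, then an ordinary least-squares fit.
X = np.column_stack([np.ones(n_counties), temperature, median_age, log_density])
coef, *_ = np.linalg.lstsq(X, log_death_rate, rcond=None)
temp_coef = coef[1]

print(f"estimated temperature coefficient: {temp_coef:.3f}")
```

A negative temperature coefficient in a fit like this is what "everything else being equal, warmer counties have lower death rates" means in practice: the other columns of the design matrix soak up the differences they can explain, and the temperature coefficient describes what is left.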

Of course, there are all sorts of caveats – see discussion here – about extrapolating from conclusions of this type to assuming seasonality in the worldwide transmission behaviour of the virus. But it is, at least, another reason to be optimistic, if not exactly cheerful.

Risky talk

In a previous post I referenced a book by the eminent statistician David Spiegelhalter. Since earlier this year, David has also been producing a podcast ‘Risky Talk‘ on the relevance of Statistics for various issues of public interest. The latest of these is titled ‘Coronavirus: Understanding the Numbers’ and is full of useful information and discussion. It includes, among other things, a discussion of:

  1. Which data are most reliable for understanding the epidemic;
  2. How the different approaches to the epidemic adopted in Norway and Sweden provide a live experiment for assessing the impact of social controls;
  3. A comparison of the seriousness of COVID-19 relative to other flu-like illnesses in the UK;
  4. A discussion of the personal risk we all carry of dying from COVID-19 and other causes.

It’s a great listen and there’s probably nobody better qualified to be explaining these issues.


The graph above is the latest (as of 5th April) update from the FT showing a 7-day rolling average of the number of new COVID-19 confirmed cases through time for a number of countries. The point of using a 7-day rolling average – which means each value is the average of the previous 7-days’ values – is to reduce the effect of randomness in day-to-day variations, so as to get a smoother picture of trends. As discussed in previous posts, it’s possibly misleading to use confirmed cases as a strict measure of the epidemic scale, since the number of confirmed cases will depend in part on the protocol for testing, which varies from country to country, and even within each country through time. Nonetheless, it’s likely to be broadly interpretable as an indicator of epidemic strength.
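For anyone who wants to see the smoothing in action, here is a minimal sketch of a 7-day rolling average. The daily counts are made up, and I've assumed the window ends on the day itself and covers the 7 most recent values; the FT may handle the window edges differently.

```python
def rolling_mean(values, window=7):
    """Average of each value and the (window - 1) values before it.

    Days with fewer than `window` values available are skipped.
    """
    means = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        means.append(sum(chunk) / window)
    return means

# Hypothetical daily counts with a weekend reporting dip (made-up numbers).
daily_cases = [100, 120, 90, 130, 110, 60, 50, 140, 160, 130, 170, 150, 100, 90]
smoothed = rolling_mean(daily_cases)
print(smoothed)
```

Because every averaged value contains exactly one of each day of the week, regular weekly reporting dips cancel out rather than showing up as wiggles in the curve.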

Notwithstanding this issue, if the epidemic were growing exponentially in any country, the graph would show as a straight line on this logarithmic scale. To a greater or lesser extent, the curves for almost all countries show a tendency to flatten through time, especially from the time that social measures have been applied to limit potential transmissions through contact. The curve for the UK remains stubbornly close to linear, but its lockdown was introduced later – in relative terms – than for most other European countries. The curve for Italy seems to have flattened quicker than for other countries – again relative to when the country was placed on lockdown – but that’s probably because severe local restrictions were placed on the worst-affected regions some time before the entire country was placed on lockdown.
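The claim that exponential growth shows up as a straight line on a logarithmic scale is easy to check numerically. The sketch below uses a hypothetical epidemic growing by 25% a day (both numbers are assumptions chosen for illustration) and confirms that the logarithm of the case count rises by the same amount every day.

```python
import math

# Hypothetical epidemic: 100 cases on day 0, growing 25% per day.
initial_cases = 100.0
daily_growth = 1.25
cases = [initial_cases * daily_growth ** day for day in range(10)]

# Plotting on a logarithmic axis amounts to plotting log10(cases).
log_cases = [math.log10(c) for c in cases]

# The day-to-day step in log10(cases) is constant, i.e. a straight line.
steps = [b - a for a, b in zip(log_cases, log_cases[1:])]
print(steps)
```

Each step equals log10 of the daily growth factor, so a curve that bends downwards on the log scale is one whose daily growth factor is shrinking.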

But anyway…. the point I wanted to make in this post is a little different. There are several reasons why it’s a good idea to use a logarithmic scale in graphs like the one above. Mostly this is because there are good epidemiological reasons to believe – as discussed here –  that an unchecked epidemic will grow exponentially. And exponential growth on a logarithmic scale will appear linear, which makes comparisons and contrasts much easier. But one disadvantage of the logarithmic scale in this context is that it can give a false impression as to the degree of similarity between countries. Looking at the above graph, it’s true that the trajectory for the United States looks currently worse than that for other countries, but not so much worse. But now look at the same graph, from a day or two earlier, on a linear, instead of logarithmic, scale:

On this scale the difference in trajectory for the United States relative to each of the other countries is much more apparent. The current level is very much greater, while the tendency for growth is also considerably more dramatic.

In summary, different scales for graphs are useful for different purposes. And though the logarithmic scale is better than a linear scale for most purposes in tracking an epidemic, it’s only once you put things back on a linear scale that you get a true sense of how different the epidemic currently is on the ground in different countries.


A changing world

In an earlier post, I discussed the ‘stringency index’, which has recently been developed as a way of measuring how severe – stringent – a country’s response has been to the Coronavirus epidemic.

The Financial Times, as part of its live coronavirus coverage, has now produced the following animated world map of the stringency index from the start of the year up to 24 March:

It’s striking how most of the world outside of China stays blue for most of February – arguably time thrown away – and how rapidly most of the world turns red and purple from the middle of March.

As an aside, the tweet below contains a great video where John Burn-Murdoch of the FT explains several of the decisions made by his team in the way they have chosen to present graphs showing the scale of the epidemic across countries:

Sex and the Coronavirus

Actually, not in that sense, but you can find relevant information here.

For good and for bad, the Coronavirus epidemic is generating a large amount of data. And as more data become available, Statistics plays its part in understanding the virus in terms of its mechanisms of transmission and spread.

One very obvious aspect of the original Chinese data – described in an academic paper in the Lancet – which has subsequently been confirmed as data from other countries became available, is a difference in death rates for infected males and females. The rate of contagion for males and females is broadly similar, as shown in the following diagram:

The slight difference in rate of infection between the sexes has also been subsequently observed in other countries – males always having a slightly higher infection rate – so although the difference is slight, it’s likely to be a genuine phenomenon rather than a random effect due to small amounts of data.

But in any case, this difference in infection rates pales into insignificance when comparing death rates for males and females. In the original Lancet paper the ratio of male to female Coronavirus deaths is reported as 73% : 27%. So if you’re a male, does this place you in a higher risk category?

Not necessarily. In pre-coronavirus days, various posts in this blog – for example here – discussed the way that an apparent effect, such as death rates varying according to an individual’s sex, could actually be due to an entirely different phenomenon. In particular, smoking rates among men in China are very much higher than those of women. And since almost all deaths due to Coronavirus occur via failures of the respiratory system, it was hypothesised that the increased death rates among men were actually a consequence of smokers being at higher risk.

Unfortunately for men – though not for smokers – this hypothesis has been found to be unsupported by data from other countries. Based on the latest available data from all countries, the death rates for males and females who contract COVID-19 are given by the following table:

The fatality rates are different depending on whether you look at confirmed or unconfirmed cases, but in each case the ratio of fatalities of males to females is around 62% : 38%. This is a less extreme ratio than was found from the Chinese data, but since this now includes data from countries where the difference in smoking rates between males and females is much smaller than for China, it implies that smoking is not the only issue. It might explain why the ratio is worse for China than elsewhere, but it can’t be the whole story.
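To put those two ratios on a common footing, the short sketch below converts each one into a "how many times higher" figure. The function name is my own, and for the Chinese figure the conversion assumes roughly equal numbers of infected males and females, which the data above suggest is close to, but not exactly, true.

```python
def relative_rate(male_share, female_share):
    """How many times higher the male figure is than the female figure."""
    return male_share / female_share

china_ratio = relative_rate(73, 27)    # shares of deaths reported in the Lancet paper
global_ratio = relative_rate(62, 38)   # shares from the pooled data for all countries
print(round(china_ratio, 2), round(global_ratio, 2))
```

So the Chinese data correspond to a male death rate roughly 2.7 times the female rate, while the pooled data give a factor of roughly 1.6: smaller, but still a long way from 1.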

This New York Times article based on the Italian data points out that previous coronavirus epidemics such as SARS and MERS also led to higher fatality rates among males, and argues this is likely to be due to women having generally stronger immune defence systems due to genetics.


  1. Various newspaper articles have discussed this phenomenon: here, for a discussion of the Italian data; here for a discussion of the Spanish data.
  2. The Lancet paper referred to above was published on 29 January. It concluded

We have to be aware of the challenge and concerns brought by 2019-nCoV to our community. Every effort should be given to understand and control the disease, and the time to act is now.

Of course, it’s easy to be wise after the event. But the Lancet paper was wise before any of the events outside of China had taken place.




Stay strong, stay at home

This is a quick follow-up to yesterday’s post, ‘Reasons to be cheerful’. I suggested there that looking at the same data in different ways can give you an alternative perspective on things. Specifically, I showed how looking at the rate of change in the number of new Coronavirus cases leads to a more optimistic view of how the epidemic is being brought under control, compared to just looking at the cases.

The following graph is like that of the previous post, but now showing the number of deaths due to Coronavirus through time in the worst affected countries.

Again, for most countries, there is some slight flattening of the curves, but if you live in a country like Italy, it’s difficult to see much encouragement that things are actually improving, despite the country now having been in total lockdown for 3 weeks.

But, in a series of very helpful tweets, Julia Steinberger, who is professor in social ecology and ecological economics at the University of Leeds, presented the data in a different way, shedding a different light on things. Her graph, shown below, plots the current doubling rate of new Coronavirus deaths against the total number of deaths. The doubling rate is the number of days it will take the number of deaths to double if the current rate of deaths is maintained. So, the higher the value, the better the epidemic is being contained.
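As a rough guide to how a doubling rate can be calculated, here is a minimal sketch. It assumes the cumulative total grows by a constant fraction each day, estimated from two consecutive days' totals; the totals are made up, and this is not necessarily the exact calculation behind the graph.

```python
import math

def doubling_time(daily_growth_fraction):
    """Days for a total to double at a constant daily growth fraction.

    Solves (1 + r) ** days == 2 for days.
    """
    return math.log(2) / math.log(1 + daily_growth_fraction)

def growth_from_totals(total_yesterday, total_today):
    """Daily growth fraction implied by two consecutive cumulative totals."""
    return total_today / total_yesterday - 1

# Hypothetical cumulative death totals on two consecutive days (made up).
r = growth_from_totals(1000, 1150)
print(f"daily growth {r:.0%}, doubling time {doubling_time(r):.1f} days")
```

In practice a single day's growth is noisy, so it would be sensible to estimate the growth fraction over several days before converting it to a doubling time.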

Looked at this way:

  • Improvements in Spain, Italy and especially China, where social restrictions have been in place longest, are evident.
  • In the US, where the potential scale of the epidemic was initially underestimated, the doubling rate decreased for some time, and has only recently started to climb.
  • In the UK, after initially climbing, the doubling rate has actually been declining, though the number of deaths in the last couple of days since the graph was produced has been lower, so the doubling rate has actually increased in recent days. Based on today’s numbers, the current doubling rate is around 4.9 days for the UK.

Updating to the most recent numbers for other countries as well, we find the current doubling rate for the US is around 4.6 days, for Spain it’s 5.4 days and for Italy it’s 9.53 days. In other words, it’s improving everywhere.

Admittedly, the picture is a bit more noisy than that of the previous post,  partly because there are fewer deaths than cases, and also because these are daily values rather than weekly averages. But in any case the message is clear, especially once numbers are updated using the most recent data: social restrictions are working and numbers are improving, even if it’s difficult to see from the original plot. Re-interpreting the numbers in terms of doubling rates gives a much more optimistic picture of how the epidemic is being brought under control.

In summary: stay strong, stay at home. It does work.

It’s probably best to be a little cautious when interpreting the recent improvement in the UK numbers. Legally binding social restrictions have only been in place for a week, which is too short a time for effects to show up in the numbers of fatalities. So, whatever improvements there have been in numbers in the last couple of days are not due to government restrictions. It’s possible, however, that people’s behaviour patterns had changed in advance of the formal government restrictions being announced, and this is what’s driving the improvement in numbers. It’s also possible, however, that the improvement is due to a combination of noise and changes in the way the data are being collated. We’ll get a clearer picture in the next few days once more data become available.

Reasons to be cheerful…

Ok, not cheerful exactly, but optimistic.

Often, looking at the same data in a different way can give a completely different perspective on things. The following graph is the updated number of reported Coronavirus cases country-by-country through time.

A few comments:

  • The graph for each country is shifted so that time is measured from the first date on which 1000 cases were reported in that country. In this way the graph for each country is starting at roughly the same level.
  • The graph is on a logarithmic scale, meaning that exponential growth, as discussed in earlier posts, would show up as a straight line on this graph.
  • Almost all countries display exponential growth at the start of the epidemic followed by a flattening. Both the rate of exponential growth and the tendency to flatten vary from country to country.
  • Despite the lockdowns and other restrictions imposed in many countries in recent weeks, it’s hard to convince yourself that there’s been any noticeable improvement.
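The shifting of each country's curve described in the first comment above can be sketched as follows. The function name and the cumulative counts are hypothetical, and the threshold of 1000 is simply the one mentioned for this graph.

```python
def align_from_threshold(cumulative_cases, threshold=1000):
    """Drop the days before the total first reaches `threshold`.

    Index 0 of the result is the first day at or above the threshold.
    Returns an empty list if the threshold is never reached.
    """
    for day, total in enumerate(cumulative_cases):
        if total >= threshold:
            return cumulative_cases[day:]
    return []

# Hypothetical cumulative case counts (made-up numbers, not real data).
country_a = [200, 450, 900, 1300, 2100, 3500]
country_b = [50, 80, 150, 400, 1000, 1900, 3200]

aligned = {
    "A": align_from_threshold(country_a),
    "B": align_from_threshold(country_b),
}
print(aligned)
```

After alignment, position 0 in each list means "the day the country passed 1000 cases", so countries whose epidemics started weeks apart can be compared like for like.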

And yet… based on the same data – albeit half a day later or so – the following graph shows the percentage increase in new cases – averaged over the previous week to minimise the effect of random day-to-day changes.

For almost all of the countries, the daily percentage increase in cases has fallen and is continuing to fall. In Italy, for example, the daily increase has gone down from around 19% to 8% in the space of a couple of weeks. The trend in the UK is also downwards, but by a smaller amount. However, enforced social controls have only been in place in the UK for less than a week.
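For the curious, here is a minimal sketch of how a weekly-averaged daily percentage increase can be computed. I've assumed the percentage is the day-on-day growth in the cumulative total, averaged over the 7 most recent days; the totals are generated from made-up growth rates and the published graph may define things slightly differently.

```python
def daily_percent_increase(cumulative_cases):
    """Percentage growth of the cumulative total from each day to the next."""
    return [
        100.0 * (today - yesterday) / yesterday
        for yesterday, today in zip(cumulative_cases, cumulative_cases[1:])
    ]

def trailing_average(values, window=7):
    """Mean of each value and the (window - 1) values before it."""
    return [
        sum(values[end - window:end]) / window
        for end in range(window, len(values) + 1)
    ]

# Hypothetical cumulative totals whose growth slows from 20% to 6% a day.
growth_rates = [0.20, 0.19, 0.17, 0.15, 0.13, 0.11, 0.10, 0.08, 0.07, 0.06]
totals = [1000.0]
for rate in growth_rates:
    totals.append(totals[-1] * (1 + rate))

weekly_avg_increase = trailing_average(daily_percent_increase(totals))
print([round(v, 1) for v in weekly_avg_increase])
```

Note that the totals themselves keep rising throughout this example; it is only the averaged percentage increase that falls, which is exactly why the two graphs can leave such different impressions.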

One slight caveat is that the information from these graphs is limited to confirmed cases. This means that:

  1. The numbers themselves are bound to be an underestimate of the number of infected individuals in a country;
  2. Comparisons between countries are complicated by the fact that some countries are testing many more individuals than others;
  3. And the trajectory for each country is also complicated by possible changes in testing protocols as the epidemic has evolved.

Nonetheless, the overall trends in these graphs are likely to be broadly indicative of a slowing of the epidemic in almost all countries. The picture for the US is especially complicated, however, due to wide-scale state-by-state differences in testing protocols that are also changing rapidly in time.

Of course, what we’re seeing in terms of changes in growth rate is also present in the graph above on case numbers. The almost linear reduction in growth rates is due to the slight flattening of the curves of the case numbers on a logarithmic scale. It’s simply that looking at the data this way, the daily changes are highlighted and we get a more realistic – even optimistic – picture of how, despite daily numbers of cases that seem persistently high, the projection for a couple of weeks’ time is that the rate of new cases will be totally manageable.

So, be optimistic, cheerful even, that the social restrictions are having an effect on the epidemic growth and that there is light at the end of the tunnel.

And if you need them,  here are many more reasons to be cheerful, curated by David Byrne, no less.


The World Health Organisation officially declared the current Coronavirus outbreak a pandemic on 11 March. A pandemic is technically defined as:

… new disease for which people do not have immunity spreads around the world beyond expectations…

though this is largely subjective, which is why the declaration for the current outbreak was not made till 11 March. But even before that date, most countries realised the Coronavirus epidemic was already on their doorsteps and needed some kind of response.

But how rapid and how stringent have different countries been in their responses?

This is the subject of a new tracker which monitors how different governments have responded to the crisis according to the number of cases they presently have in their country. Specifically, they define something called a stringency index which records, on a scale of 0 to 100, how stringent a country’s measures are. Full details of the definition of the stringency index and the methodology used are available here. Broadly speaking, the more restrictive and widespread a country’s measures, the greater the value of the index. However, the index does not measure how effective the measures are, nor how strictly they are applied or followed.

The tracker is live, which means it is regularly updated. However, as of 24 March, a summary of the way 6 different countries have responded to the crisis is contained in the following figure:

For each country, time is measured in days since the first case appeared in that country, and the black curve shows the trajectory of the epidemic in terms of number of cases. (Bear in mind though that the number of cases is also related to the number of tests carried out, so direct comparison of these curves across countries may not be entirely valid).

The red dots show the value of the stringency index on the same timescale. You need to look at the right-hand axis to read-off the actual values of the index. For all countries the stringency index has generally risen as the epidemic has grown: countries have responded to the crisis by bringing in measures to control the virus spread. But there are significant differences across the different countries:

  • In France and especially Italy, the stringency index follows the trajectory of the epidemic very closely. In other words, governments there have responded quickly to the scale of the epidemic as it has grown.
  • In South Korea, where the epidemic has been largely controlled, the stringency measure values increase ahead of the growth of the epidemic. That’s to say, the government has anticipated the growth of the epidemic and brought disease control measures in quickly to stop the epidemic growth before it occurred.
  • The United Kingdom’s first use of restrictive measures was very slow, and they have since been playing catch-up relative to the size of the epidemic.
  • In the US, there was almost no attempt at control until long after the start of the epidemic. Belatedly, more stringent measures have been applied, but these are still substantially less restrictive than those of France or Italy.
  • China’s pattern is more complicated. Since they were the first country affected by the outbreak, it’s perhaps understandable that their initial response was slow. Their subsequent response was rapid, though, enabling a subsequent reduction in stringency, which has more recently been raised again – presumably in an attempt to prevent a second wave of the epidemic. The maximum stringency index is considerably lower than that of France or Italy, presumably because although their measures were more restrictive, they were localised in severity to the hardest-hit province of Hubei.

One might quibble about the actual definitions used for the stringency index, but these conclusions broadly chime with common perceptions about the efficacy of different government responses to the epidemic.

Numbers and pictures

Statistics is playing a fundamental role in supporting decision-makers by providing predictions of the Coronavirus epidemic spread and of the likely impact of possible courses of action they could take. Nothing is certain – from the transmission of the disease, to the way individuals will behave – which is why probability theory plays such an important role. We can’t be sure certain things will happen, but we can reasonably assign probabilities to them.

But at a more elementary level, clear presentation of data, in both numerical and graphical form, is also important for understanding many characteristics of the epidemic. There are now various sources of well-presented information, and I thought it might be helpful to provide a list here of the best ones I’ve found so far. If anyone has alternative sources, please send them to me or include them in the comments below and I’ll add them to the list.


This page gives current counts of various types – including new cases – per country. It also includes simple graphics that track the epidemic evolution. There are links to each individual country, where a country-specific history of numbers is available, and also links to look at effects by age, sex and so on. Graphs and so on are updated daily, but the numbers themselves are updated every time a country releases new daily data.


This page is updated daily and gives very clear graphics of a number of aspects of the epidemic. It shows, for example, slight differences in the age distribution of mortalities for Italy and China and also compares the mortality and contagion rate for this epidemic against those of other epidemics and diseases.


This is a dashboard giving numbers and a geographical display of current cases. A more detailed UK-specific version of the dashboard is also hosted here.


This is a similar country-specific dashboard, but for Italy.


The Financial Times gives this comparison of the epidemic growth across countries. It’s updated daily, though sometimes I can’t get past a paywall. Similar figures are available anyway in the dashboards above.

Like I say, please let me know of any other useful sources and I’ll add them to the list.


This page has updated graphs that allow you to compare the trajectory of the epidemic in specified countries over different timescales and on different scales.


One of the difficulties in determining how to respond to the Coronavirus epidemic is a lack of evidence on which to base decisions. But since some countries – most notably China, South Korea and Italy – are ahead of the UK’s trajectory, there are lessons emerging which could inform decision-makers.

For example: what is the effect of placing areas in a lockdown?

It’s an imperfect analysis, but the following graph shows the trajectory of the total number of cases in two provinces of Lombardia.

The regions are broadly similar geographically and demographically, so it’s not totally unreasonable to consider them as equivalent when making comparisons. However, Lodi started with more cases and was placed in a  state of lockdown as of 23 February. Bergamo was also placed in lockdown, but considerably later, on 8 March.

Looking at the figure, though the number of cases has grown in both provinces, in Lodi the growth is more or less linear, with signs of levelling off. In Bergamo, the growth appears to have started at an exponential rate, with a change to linear growth soon after the lockdown there, but with a steeper rate than that of Lodi. If it’s fair to make a direct comparison between these two provinces, there’s strong evidence that locking down early has a considerable impact on an epidemic’s growth.

Of course, there are a number of other factors to take account of, some of which may favour Bergamo over Lodi. People’s freedom of movement has been maintained for a longer period in Bergamo, and the effect on the economy is likely to be slighter – at least in the period shown in the graph. And we don’t know what will happen in the future – maybe when things are eventually relaxed the number of cases will grow faster in Lodi than in Bergamo.

Nonetheless, by the strict measure of short-term growth of the epidemic, the evidence here is that an early and comprehensive lockdown is an effective strategy in containing numbers of new cases.

As I’m writing this I’ve just heard Boris Johnson announce a nationwide voluntary restriction on social contact in the UK. Time will tell whether this is stringent enough to get the same braking effect on the epidemic growth as was achieved in Lodi or if – as in Italy nationally from last week – a legally enforceable version of a lockdown will prove necessary.