Terrible maps

One of the themes in this blog has been the creative use of diagrams to represent statistical data. When the data are collected geographically this amounts to using maps to represent data – perhaps using colours or shadings to show how a variable changes over a region, country or even the whole world.

With this in mind I recommend to you @TerribleMaps on twitter.

It’s usually entertaining, and sometimes – though not always – scientific. Here are a few recent examples:

  1. Those of you with kids are probably lamenting right now the length of the summer holidays. But just look how much worse it could be if, for example, you were living in Italy (!):
  2. Just for fun… a map of the United States showing the most commonly used word in each state:
  3. A longitudinal slicing of the world by population size. It’s interesting because the population of each slice will depend both on the number of countries it includes and on the population density within it.
  4. For each country in the following map, the flag shown is that of the country with which it shares the longest border. For example, the UK has its longest border with Ireland, and so is represented by the Ireland flag. Similarly, France’s flag is that of Brazil!
  5. This one probably only makes sense if you were born in, or have spent time living in, Italy:
  6. While this one will help you get clued-up on many important aspects of UK culture:
  7. And finally, this one will help you understand how ‘per capita’ calculations are made. You might notice there’s one country with an N/A entry. Try to identify which country that is and explain why its value is missing.

In summary, as you’ll see from these examples, the maps are usually fun, sometimes genuinely terrible, but sometimes contain a genuine pearl of statistical or geographical wisdom. If you have to follow someone on twitter, there are worse choices you could make.


Zipf it

In a recent post I explained that in a large database containing the words from many English-language texts of various types, the word ‘football’ occurred 25,271 times, making it the 1543rd most common word in the database. I also said that the word ‘baseball’ occurred 28,851 times, and asked you to guess what its rank would be.

With just this information available, it’s impossible to say with certainty what the exact rank will be. We know that ‘baseball’ is more frequent than ‘football’ and so it must have a higher rank (which means a rank with a lower number). But that simply means it could be anywhere from 1 to 1542.

However, we’d probably guess that ‘baseball’ is not that much more popular a word than ‘football’; certainly other words like ‘you’, ‘me’, ‘please’ and so on are likely to occur much more frequently. So, we might reasonably guess that the rank of ‘baseball’ is closer to the lower limit of 1542 than it is to the upper limit of 1. But where exactly should we place it?

Zipf’s law provides a possible answer.

In its simplest form Zipf’s law states that for many types of naturally occurring data – including frequencies of word counts – the second most common word occurs half as often as the most common; the third most common occurs a third as often as the most popular; the fourth most common occurs a quarter as often; and so on. If we denote by f(r) the frequency of the item with rank r, this means that

f(r) = C/r

or

r\times f(r)=C,

where C is the constant f(1). And since this is true for every choice of r, the frequencies and ranks of the words ranked r and s are related by

r\times f(r)=s \times f(s).

Then, assuming Zipf’s law applies,

rank(\mbox{`baseball'}) = rank(\mbox{`football'}) \times f(\mbox{`football'})/f(\mbox{`baseball'})

= 1543 \times 25271/28851 \approx 1352
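
In code, this estimate is just a rearrangement of the relation above. Here’s a minimal sketch, using only the numbers quoted in the post:

```python
def zipf_rank_estimate(known_rank, known_freq, other_freq):
    """Estimate the rank of a word from its frequency, given the rank and
    frequency of another word, assuming rank * frequency is constant."""
    return known_rank * known_freq / other_freq

# 'football': rank 1543, frequency 25,271; 'baseball': frequency 28,851
print(round(zipf_rank_estimate(1543, 25271, 28851)))  # approximately 1352
```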

So, how accurate is this estimate? The database I extracted the data from is the well-known Brown University Standard Corpus of Present-Day American English. The most common 5000 words in the database, together with their frequencies, can be found here. Searching down the list, you’ll find that the rank of ‘baseball’ is 1380, so the estimated value of 1352 is not that far out.

But where does Zipf’s law come from? It’s named after the linguist George Kingsley Zipf (1902-1950), who observed the law to hold empirically for words in different languages. Rather like Benford’s law, which we discussed in an earlier post, different arguments can be constructed that suggest Zipf’s law might be appropriate in certain contexts, but none is overwhelmingly convincing, and it’s really the body of empirical evidence that provides its strongest support.

Actually, Zipf’s law

f(r) = C/r,

is equivalent to saying that the frequency distribution follows a power law where the power is equal to -1. But many fits of the model to data can be improved by generalising this model to

f(r)=C/r^k

for some constant k. In this more general form the law has been shown to work well in many different contexts, including sizes of cities, website access counts, gene expression frequencies and strength of volcanic eruptions. The version with k=1 is found to work well for many datasets based on frequencies of word counts, but other datasets often require different values of k. But to use this more general version of the law we’d have to know the value of k, which we could estimate if we had sufficient amounts of data. The simpler Zipf’s law has k=1 implicitly, and so we were able to estimate the rank of ‘baseball’ with just the limited amount of information provided.
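
If we did have a sample of words with both their ranks and frequencies, estimating k is a simple regression problem: taking logs of f(r) = C/r^k gives log f = log C − k log r, a straight line with slope −k. Here’s a rough sketch of that fit; the (rank, frequency) pairs below are invented purely for illustration, not real corpus data:

```python
import numpy as np

# Hypothetical (rank, frequency) pairs -- not real corpus data.
ranks = np.array([1, 2, 5, 10, 50, 100, 500, 1000])
freqs = np.array([22000, 11500, 4300, 2300, 420, 230, 41, 23])

# Least-squares fit of log(frequency) against log(rank):
# log f = log C - k * log r, so k is minus the slope.
slope, intercept = np.polyfit(np.log(ranks), np.log(freqs), 1)
k_hat, C_hat = -slope, np.exp(intercept)
print(f"estimated k = {k_hat:.2f}, estimated C = {C_hat:.0f}")
```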

Finally, I had just 3 responses to the request for predictions of the rank of ‘baseball’: 1200, 1300 and 1450, each of which is entirely plausible. But if I regard each of these estimates as those of an expert and try combining those expert opinions by taking the average I get 1317, which is very close to the Zipf law prediction of 1352. Maybe if I’d had more replies the average would have been even closer to the Zipf law estimate or indeed to the true answer itself 😏.

 

Data controversies

 

Some time ago I wrote about Mendel’s law of genetic inheritance, and how statistical analysis of Mendel’s data suggested his results were too good to be true. It’s not that his theory is wrong; it’s just that the data he provided as evidence lack the variation that his own law implies should occur in measurements of that type, leading to the charge that they had been manufactured or manipulated in such a way as to seem incontrovertible.

Well, there’s a similar controversy about the picture at the top of this page.

The photograph, taken 100 years ago, was as striking at that time as the recent picture of a black hole, discussed in an earlier post, is today. However, this picture was taken with basic photographic equipment and a telescopic lens, and shows a total solar eclipse, as the moon passes directly between the Earth and the Sun.

A full story of the controversy is given here.

In summary: Einstein’s theory of general relativity describes gravity not as a force between two attracting masses – as is central to Newtonian physics – but as a curvature caused in space-time due to the presence of massive objects. All objects cause such curvature, but only those that are especially massive, such as stars and planets, will have much of an effect.

Einstein’s relativity model was completely revolutionary compared to the prevailing view of physical laws at the time. But although it explained various astronomical observations that were anomalous according to Newtonian laws, it had never been used to predict anomalous behaviour. The picture above, and similar ones taken at around the same time, changed all that.

In essence, blocking out the sun’s rays enabled dimmer and more distant stars to be accurately photographed. Moreover, if Einstein’s theory were correct, the photographic position of these stars should be slightly distorted because of the spacetime curvature effects of the sun. But the effect is very slight, and even Newtonian physics suggests some disturbance due to gravitational effects.

In an attempt to get photographic evidence at the necessary resolution, the British astronomer Arthur Eddington set up two teams of scientists – one on the African island of Príncipe, the other in Sobral, Brazil – to take photographs of the solar eclipse on 29 May, 1919. Astronomical and photographic equipment was much more primitive in those days, so this was no mean feat.

Anyway, to cut a long story short, a combination of poor weather conditions and other setbacks meant that the results were less reliable than had been hoped for. It seems that the data collected at Príncipe, where Eddington himself was stationed, were inconclusive, falling somewhere between the Newton and Einstein model predictions. The data at Sobral were taken with two different types of telescope, with one set favouring the Newton view and the other Einstein’s. Eddington essentially combined the Einstein-favouring data from Sobral with those from Príncipe and concluded that the evidence supported Einstein’s relativistic model of the universe.

Now, in hindsight, with vast amounts of empirical evidence of many types, we know Einstein’s model to be fundamentally correct. But did Eddington selectively choose his data to support Einstein’s model?

There are different points of view, which hinge on Eddington’s motivation for dropping a subset of the Sobral data from his analysis. One point of view is that he wanted Einstein’s view to be correct, and therefore simply ignored the data that were less favourable. This argument is fuelled by political reasoning: it argues that since Eddington was a Quaker, and therefore a pacifist, he wanted to support a German theory as a kind of post-war reconciliation.

The alternative point of view, for which there is some documentary evidence, is that the Sobral data which Eddington ignored had been independently designated as unreliable. Therefore, on proper scientific grounds, Eddington had behaved entirely correctly by excluding it from his analysis, and his subsequent conclusions favouring the Einstein model were entirely consistent with the scientific data and information he had available.

This issue will probably never be fully resolved, though in a recent review of several books on the matter, theoretical physicist Peter Coles (no relation) claims to have reanalysed the data given in the Eddington paper using modern statistical methods, and found no reason to doubt his integrity. I have no reason to doubt that point of view, but no detail is given of the statistical analysis that was carried out.

What’s interesting though, from a statistical point of view, is how the interpretation of the results depends on the reason for the exclusion of a subset of the Sobral data. If your view is that Eddington knew their contents and excluded them on that basis, then his conclusions in favour of Einstein must be regarded as biased. If you accept that Eddington excluded these data a priori because of their unreliability, then his conclusions were fair and accurate.

Data are often treated as a neutral aspect of an analysis. But as this story illustrates, the choice of which data to include or exclude, and the reasons for doing so, may be factors which fundamentally alter the direction an analysis will take, and the conclusions it will reach.


Word rank

I recently came across a large database of American English word usage. It aims to provide a representative sample of American English by including words extracted from a large number of English texts of different types – books, newspaper articles, magazines etc. In total it includes around 560 million words collected over the years 1990-2017.

The word ‘football’ occurs in the database 25,271 times and has rank 1543. In principle, this means that ‘football’ was the 1543rd most frequent word in the database, though the method used for ranking the database elements is a little more complicated than that, since it attempts to combine a measure of both the number of times the word appears and the number of texts it appears in. Let’s leave that subtlety aside though and assume that ‘football’, with a frequency of 25,271, is the 1543rd most common word in the database.

The word ‘baseball’ occurs in the same database 28,851 times. With just this information, what would you predict the rank of the word ‘baseball’ to be? For example, if you think ‘baseball’ is the most common word, it would have rank 1. (It isn’t: ‘the’ is the most common word). If you think ‘baseball’ would be the 1000th most common word, your answer would be 1000.

Give it a little thought, but don’t waste time on it. I really just want to use the problem as an introduction to an issue that I’ll discuss in a future post. I’d be happy to receive your answer though, together with an explanation if you like, by mail. Or if you’d just like to fire an answer anonymously at me, without explanation, you can do so using this survey form.

 

Revel in the amazement

In an earlier post I included the following table:

As I explained, one of the columns contains the genuine land areas of each country, while the other is fake. And I asked you which is which.

The answer is that the first column is genuine and the second is fake. But without a good knowledge of geography, how could you possibly come to that conclusion?

Well, here’s a remarkable thing. Suppose we take just the leading digit of each of the values. Column 1 would give 6, 2, 2, 1,… for the first few countries, while column 2 would give 7, 9, 3, 3,… It turns out that for many naturally occurring phenomena, you’d expect the leading digit to be 1 on around 30% of occasions. So if the actual proportion is a long way from that value, then it’s likely that the data have been manufactured or manipulated.

Looking at column 1 in the table, 5 out of the 20 countries have a land area with leading digit 1; that’s 25%. In column 2, none do; that’s 0%. Even 25% is a little on the low side, but close enough to be consistent with 30% once you allow for discrepancies due to random variations in small samples. But 0% is pretty implausible. Consequently, column 1 is consistent with the 30% rule, while column 2 is not, and we’d conclude – correctly – that column 2 is faking it.

But where does this 30% rule come from? You might have reasoned that each of the digits 1 to 9 were equally likely – assuming we drop leading zeros – and so the percentage would be around 11% for a leading digit of 1, just as it would be for any of the other digits. Yet that reasoning turns out to be misplaced, and the true value is around 30%.

This phenomenon is a special case of something called Benford’s law, named after the physicist Frank Benford who first formalised it. (Though it had also been noted much earlier by the astronomer Simon Newcomb). Benford’s law states that for many naturally occurring datasets, the probability that the leading digit of a data item is 1 is equal to 30.1%. Actually, Benford’s law goes further than that, and gives the percentage of times you’d get a 2 or a 3 or any of the digits 1-9 as the leading digit. These percentages are shown in the following table.

| Leading digit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Frequency | 30.1% | 17.6% | 12.5% | 9.7% | 7.9% | 6.7% | 5.8% | 5.1% | 4.6% |

For those of you who care about such things, these percentages are log(2/1), log(3/2), log(4/3) and so on up to log(10/9), where log here is logarithm with respect to base 10.
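
If you want to check the table for yourself, the percentages drop straight out of that formula. A minimal sketch in Python:

```python
import math

# Benford's law: P(leading digit = d) = log10(1 + 1/d)
for d in range(1, 10):
    print(d, f"{100 * math.log10(1 + 1/d):.1f}%")
```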

But does Benford’s law hold up in practice? Well, not always, as I’ll discuss below. But often it does. For example, I took a dataset giving the altitudes of a large set of football stadiums around the world. I discarded a few whose altitude is below sea level, but was still left with over 13,000 records. I then extracted the leading digit of each of the altitudes (in metres) and plotted a histogram of these values. This is just a plot of the percentages of occasions each value occurred. These are the blue bars in the following diagram. I then superimposed the predicted proportions from Benford’s law. These are the black dots.

[Figure: histogram of observed leading-digit percentages for stadium altitudes (blue bars), with the Benford’s law predictions superimposed (black dots)]

The agreement between the observed percentages and those predicted by Benford’s law is remarkable. In particular, the observed percentage of leading digits equal to 1 is almost exactly what Benford’s law would imply. I promise I haven’t cheated with the numbers.
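
For anyone who’d like to reproduce this kind of check on their own data, here’s a rough sketch. The altitude values below are placeholders – the real stadium dataset isn’t reproduced here – but the leading-digit extraction and comparison are the same:

```python
import math
from collections import Counter

def first_digit(x):
    """Leading (non-zero) digit of a positive number."""
    return int(f"{x:e}"[0])  # scientific notation starts with the leading digit

altitudes = [12, 660, 35, 2240, 105, 8, 540, 1317]  # placeholder values, not the real data
counts = Counter(first_digit(a) for a in altitudes)

for d in range(1, 10):
    observed = 100 * counts[d] / len(altitudes)
    benford = 100 * math.log10(1 + 1/d)
    print(f"digit {d}: observed {observed:5.1f}%   Benford {benford:5.1f}%")
```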

As further examples, there are many series of mathematically generated numbers for which Benford’s law holds exactly.

These include:

  • The Fibonacci series: 1, 1, 2, 3, 5, 8, 13, …, where each number is obtained by summing the 2 previous numbers in the series.
  • The integer powers of two: 1, 2, 4, 8, 16, 32, …
  • The iterative series obtained by starting with any number and successively multiplying by 3. For example, starting with 7, we get: 7, 21, 63, 189, …

In each of these cases of infinite series of numbers, exactly 30.1% will have leading digit equal to 1; exactly 17.6% will have leading digit equal to 2, and so on.
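
It’s easy to check this claim numerically for, say, the powers of two – the observed proportions get closer and closer to the Benford percentages as more terms are included. A quick sketch:

```python
import math
from collections import Counter

# Leading digits of the first 10,000 powers of two (starting from 2^0 = 1).
digits = Counter(int(str(2**n)[0]) for n in range(10000))

for d in range(1, 10):
    observed = 100 * digits[d] / 10000
    benford = 100 * math.log10(1 + 1/d)
    print(f"digit {d}: observed {observed:.2f}%   Benford {benford:.2f}%")
```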

And there are many other published examples of data fitting Benford’s law (here, here, here… and so on.)

Ok, at this point you should pause to revel in the amazement of this stuff. Sometimes mathematics, Statistics and probability come together to explain naturally occurring phenomena in a way that is so surprising and shockingly elegant it takes your breath away.

So, when does Benford’s law work? And why?

It turns out there are various ways of explaining Benford’s law, but none of them – at least as far as I can tell – is entirely satisfactory. All of them require a leap of faith somewhere to match the theory to real-life. This view is similarly expressed in an academic article, which concludes:

… there is currently no unified approach that simultaneously explains (Benford’s law’s) appearance in dynamical systems, number theory, statistics, and real-world data.

Despite this, the various arguments used to explain Benford’s law do give some insight into why it might arise naturally in different contexts:

  1. If there is a law of this type, Benford’s law is the only one that works for all choices of scale. The decimal representation of numbers is entirely arbitrary, presumably deriving from the fact that humans, generally, have 10 fingers. But if we’d been born with 8 fingers, or had chosen to represent numbers in binary, or base 17, or something else, you’d expect a universal law to be equally valid, and not dependent on the arbitrary choice of counting system. If this is so, then it turns out that Benford’s law, adapted in the obvious way to the choice of scale, is the only one that could possibly hold. An informal argument as to why this should be so can be found here.
  2. If the logarithm of the variable under study has a distribution that is smooth and roughly symmetric – like the bell-shaped normal curve, for example – and is also reasonably well spread out, it’s easy to show that Benford’s law should hold approximately. Technically, for those of you who are interested, if X is the thing we’re measuring, and if log X has something like a normal distribution with a variance that’s not too small, then Benford’s law is a good approximation for the behaviour of X. A fairly readable development of the argument is given here. (Incidentally, I stole the land area of countries example directly from this reference.)

But in the first case, there’s no explanation as to why there should be a universal law, and indeed many phenomena – both theoretical and in nature – don’t follow Benford’s law. And in the second case, except for special situations where the normal distribution has some kind of theoretical justification as an approximation, there’s no particular reason why the logarithm of the observations should behave in the required way. And yet, in very many cases – like the land area of countries or the altitude of football stadiums – the law can be shown empirically to be a very good approximation to the truth.

One thing which does emerge from these theoretical explanations is a better understanding of when Benford’s law is likely to apply and when it’s not. In particular, the argument only works when the logarithm of the variable under study is reasonably well spread out. What that means in practice is that the variable itself needs to cover several orders of magnitude: tens, hundreds, thousands etc. This works fine for something like the stadium altitudes, which vary from close to sea-level up to around 4,000 metres, but wouldn’t work for total goals in football matches, which are almost always in the range 0 to 10, for example.
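
This ‘spread on the log scale’ condition is easy to see in a small simulation. In the sketch below – entirely synthetic data, just for illustration – lognormal samples whose logarithm is well spread out match Benford closely, while samples concentrated within a single order of magnitude don’t:

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def leading_digit_freqs(values):
    """Observed proportions of leading digits 1-9 in a sample of positive numbers."""
    digits = np.array([int(f"{v:e}"[0]) for v in values])
    return [float(np.mean(digits == d)) for d in range(1, 10)]

benford = [math.log10(1 + 1/d) for d in range(1, 10)]
wide   = rng.lognormal(mean=3.0, sigma=3.0, size=100_000)  # spans several orders of magnitude
narrow = rng.lognormal(mean=3.0, sigma=0.2, size=100_000)  # barely spans one

wide_freqs, narrow_freqs = leading_digit_freqs(wide), leading_digit_freqs(narrow)
for d in range(1, 10):
    print(f"digit {d}: Benford {benford[d-1]:.3f}   "
          f"wide {wide_freqs[d-1]:.3f}   narrow {narrow_freqs[d-1]:.3f}")
```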

So, there are different ways of theoretically justifying Benford’s law, and empirically it seems to be very accurate for different datasets which cover orders of magnitude. But does it have any practical uses? Well, yes: applications of Benford’s law have been made in many different fields, including…

Finally, there’s also a version of Benford’s law for the second digit, third digit and so on. There’s an explanation of this extension in the Wikipedia link that I gave above. It’s probably not easy to guess exactly what the law might be in these cases, but you might try and guess how the broad pattern of the law changes as you move from the first to the second and to further digits.


Thanks to those of you who wrote to me after I made the original post. I don’t think it was easy to guess what the solution was, and indeed if I was guessing myself, I think I’d have been looking for uniformity in the distribution of the digits, which turns out to be completely incorrect, at least for the leading digit. Even though I’ve now researched the answer myself, and made some sense of it, I still find it rather shocking that the law works so well for an arbitrary dataset like the stadium altitudes. Like I say: revel in the amazement.

Faking it

 

Take a look at the following table:

[Table: land areas, in square kilometres, for an alphabetical list of countries – two columns of figures]

 

It shows the total land area, in square kilometres, for various countries. Actually, it’s the first part of a longer alphabetical list of all countries and includes two columns of figures, each purporting to be the corresponding area of each country. But one of these columns contains the real areas and the other one is fake. Which is which?

Clearly, if your knowledge of geography is good enough that you know the land area of Belgium – or any of the other countries in the table – or whether Bahrain is bigger than Barbados, then you will know the answer. You could also cheat and check with Google. But you can answer the question, and be almost certain of being correct, without cheating and without knowing anything about geography. Indeed, I could have removed the first column giving the country names, and even not told you that the data correspond to land areas, and you should still have been able to tell me which column is real and which is fake.

So, which column is faking it? And how do you know?

I’ll write a follow-up post giving the answer and explanation sometime soon. Meantime, if you’d like to write to me giving your own version, I’d be happy to hear from you.

 

Midrange is dead

Kirk Goldsberry is the author of a new book on data analytics for the NBA. I haven’t read the book, but some of the graphical illustrations he’s used for its publicity are great examples of the way data visualization techniques can give insights into how a sport has evolved in terms of the way it is played.

 

Press the start button in the graphic of the above tweet. I’m not sure exactly how the graphic and the data are mapped, but essentially the coloured hexagons show regions of the basketball court which are the most frequent locations for taking shots. The animation shows how this pattern has changed over the seasons.

As you probably know, most goals in basketball – excluding penalty shots – are awarded 2 points. But a shot that’s scored from outside a distance of 7.24m from the basket – the almost semi-circular outer-zone shown in the figure – scores 3 points. So, there are two ways to improve the number of points you are likely to score when shooting: first, you can get closer to the basket, so that the shot is easier; or second, you can shoot from outside the three-point line, so increasing the number of points obtained when you do score. That means there’s a zone in-between, where the shot is still relatively difficult because of the distance from the basket, but for which you only get 2 points when you do score. And what the animation above clearly shows is an increasing tendency over the seasons for players to avoid shooting from this zone. This is perhaps partly because of a greater understanding of the trade-off between difficulty and distance, and perhaps also because improved training techniques have led to a greater competency in 3-point shots.
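
A toy expected-points calculation makes this trade-off concrete. The shooting percentages below are invented round numbers, not real NBA data:

```python
# Expected points per shot = P(make) * points awarded.
# The make probabilities are assumptions for illustration only.
shots = {
    "at the rim (2 pts)":      (0.62, 2),
    "midrange jumper (2 pts)": (0.40, 2),
    "three-pointer (3 pts)":   (0.36, 3),
}

for name, (p_make, points) in shots.items():
    print(f"{name}: expected points per shot = {p_make * points:.2f}")
```

On these made-up numbers the midrange shot is worth about 0.8 points on average, against roughly 1.1 for the three-pointer and 1.2 at the rim – clearly the worst of the three options.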

Evidence to support this reasoning is the following data heatmap diagram which shows the average number of points scored from shots taken at different locations on the court. The closer to red, the higher the average score per shot.

Again the picture makes things very clear: average points scored are highest when shooting from very close to the basket, or from outside of the 3-point line. Elsewhere the average is low. It’s circumstantial evidence, but the fact that this map of points scored has patterns so similar to the current map of where players are shooting from strongly suggests that players have evolved their style of play in order to shoot at the basket from positions which they know are more likely to generate the most points.

In summary, creative use of both static and animated graphical data representations provides great insights about the way basketball play has evolved, and why that evolution is likely to have occurred, given the 3-point shooting rule.


Thanks to Benoit.Jottreau@smartodds.co.uk for posting something along these lines on RocketChat.

Animal experiments

Ever thought your cat might be trolling you? Turns out you’re right. As explained in this New Scientist article, recent Japanese research concludes that cats are entirely capable of recognising their names; they just choose not to when it suits them.

The full details of the experiment are included in a research report published in Nature. It’s an interesting, though not entirely easy, read. But I’d like to use it to point out an aspect of statistical methodology that is often ignored: statistical analyses don’t usually start with the analysis of data; they start with the design of the experiment by which the data are to be collected. And it’s essential that an experiment is designed correctly in order to be able to use Statistics to answer the question you’re interested in.

So, in this particular study, the researchers carried out four separate experiments:

  • In experiment 1, the ability of cats to distinguish their own names from other similar nouns was tested;
  • In experiment 2, cats living with numerous other cats were tested to see if they could distinguish their own name from that of other cats in the same household;
  • Experiment 3 was like experiment 1, but using cats from a ‘cat cafe’ (don’t ask) rather than a normal household;
  • Experiment 4 was also like experiment 1, but using a voice other than the cat’s owner to trigger the responses.

Through this sequence of experiments, the researchers were able to judge whether or not the cats genuinely recognise and respond to their own names in a variety of environments, and to exclude the possibility that the responses were due to factors other than actual name recognition. As such, this is a great example of how the design of an experiment has been carefully tailored to ensure that a statistical analysis of the data it generates is able to answer the question of interest.

I won’t go into details, but there are many other aspects of the experimental design that also required careful specification:

  1. The number of cats to be included in the study;
  2. The choice of words to use as alternative stimuli to the cats’ names, and the order in which they are used;
  3. The definitions of actions that are considered positive responses to stimuli;
  4. The protocol for determining whether a cat has responded positively to a stimulus or not;

amongst others. Full details are available in the Nature article, as indeed are the data, should you wish to analyse them yourself.

In the context of sports modelling, these kinds of issues are less explicit, since analyses are usually retrospective, using data that have already been historically collected and stored. Nonetheless, the selection of which data to include can affect the results of an analysis, and it’s important to ensure that conclusions are not sensitive to specific, subjective choices. However, for analyses that include a decision process – such as betting strategies – it may well be relevant to formulate an experimental design for a prospective study, comparing results from one type of strategy with those from another. We’ll discuss strategies for this type of experiment in a future post.

 

Picture this

You can’t help but be amazed at the recent release of the first ever genuine image of a black hole. The picture itself, and the knowledge of what it represents, are extraordinary enough, but the sheer feat of human endeavour that led to this image is equally breathtaking.

Now, as far as I can see from the list of collaborators that are credited with the image, actual designated statisticians didn’t really contribute. But, from what I’ve read about the process of the image’s creation, Statistics is central to the underlying methodology. I don’t understand the details, but the outline is something like this…

Although black holes are extremely big, they’re also a long way away. This one, for example, has a diameter that’s bigger than our entire solar system. But it’s also at the heart of the Messier 87 galaxy, some 55 million light years away from Earth. Which means that when looking towards it from Earth, it occupies a very small part of space. The analogy that’s been given is that capturing the black hole’s image in space would be equivalent to trying to photograph a piece of fruit on the surface of the moon. And the laws of optics imply this would require a telescope the size of our whole planet.

To get round this limitation, the Event Horizon Telescope (EHT) program uses simultaneous signals collected from a network of eight powerful telescopes stationed around the Earth. However, the result, naturally, is a sparse grid of signals rather than a complete image. The rotation of the earth means that with repeat measurements this grid gets filled out a little. But still, there’s a lot of blank space that needs to be filled in to complete the image. So, how is that done?

In principle, the idea is simple enough. This video was made some years ago by Katie Bouman, who’s now got worldwide fame for leading the EHT program to produce the black hole image:

The point of the video is that to recognise the song, you don’t need the whole keyboard to be functioning. You just need a few of the keys to be working – and they don’t even have to be 100% precise – to be able to identify the whole song. I have to admit that the efficacy of this video was offset for me by the fact that I got the song wrong, but in the YouTube description of the video, Katie explains this is a common mistake, and uses the point to illustrate that with insufficient data you might get the wrong answer. (I got the wrong answer with complete data though!)

In the case of the music video, it’s our brain that fills in the gaps to give us the whole tune. In the case of the black hole data, it’s sophisticated and clever image-processing techniques that rely on the known physics of light transmission and a library of the patterns found in images of many different types. From this combination of physics and library of image templates, it’s possible to extrapolate from the observed data to build proposal images, and for each one find a score of how plausible that image is. The final image is then the one that has the greatest plausibility score. Engineers call this image reconstruction, but the algorithm is fundamentally statistical.
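
Just to give a feel for the statistical shape of that idea, here’s a toy caricature: score each candidate image by how well it matches the sparse observations plus how plausible it looks a priori (here, simple smoothness), then keep the highest-scoring one. Everything below is invented for illustration – real EHT imaging works on interferometric measurements and far more sophisticated priors:

```python
import numpy as np

rng = np.random.default_rng(1)

# A stand-in 'true' image: a bright blob on a dim background.
truth = np.ones((8, 8))
truth[2:6, 2:6] = 5.0

# We only get to observe a sparse, noisy subset of pixels.
mask = rng.random(truth.shape) < 0.2
observed = truth + rng.normal(0, 0.1, truth.shape)

def plausibility(candidate):
    # Data term: agreement with the sparsely observed pixels.
    data_fit = -np.sum((candidate[mask] - observed[mask]) ** 2)
    # Prior term: smoother images (small neighbouring differences) score higher.
    smoothness = -np.sum(np.diff(candidate, axis=0) ** 2) - np.sum(np.diff(candidate, axis=1) ** 2)
    return data_fit + 0.1 * smoothness

# Crude 'proposal' images; the reconstruction is the most plausible one.
proposals = [truth + rng.normal(0, s, truth.shape) for s in (0.05, 0.5, 2.0)]
best = max(proposals, key=plausibility)
print(f"best proposal differs from the truth by {np.abs(best - truth).mean():.3f} on average")
```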

At least, that’s how I understood things. But here’s Katie again giving a much better explanation in a Ted talk:

Ok, so much for black holes. Now, think of:

  1. Telescopes as football matches;
  2. Image data as match results;
  3. The black hole as a picture that contains information about how good football teams really are;
  4. Astrophysics as the rules by which football matches are played;
  5. The templates that describe how an image changes from one pixel to the next as a rule for saying how team performances might change from one game to the next.

And you can maybe see that in a very general sense, the problem of reconstructing an image of a black hole has the same elements as that of estimating the abilities of football teams. Admittedly, our football models are rather less sophisticated, and we don’t need to wait for the end of the Antarctic winter to ship half a tonne of hard drives containing data back to the lab for processing. But the principles of Statistics are generally the same in all applications, from black hole imaging to sports modelling, and everything in between.