Statty night

Apologies for the terrible pun in the title.

When I used to teach Statistics I tried to emphasise to students that Statistics is as much an art as a science. Statisticians are generally trying to make sense of some aspect of the world, and they usually have just some noisy data with which to try to do it. Sure, there are algorithms and computer packages they can chuck data into and get simple answers out of. But usually those answers are meaningless unless the algorithm/package is properly tailored to the needs of the specific problem. And there are no rules as to how that is best done: it needs a good understanding of the problem itself, an awareness of the data that are available and the creative skill to be able to mesh those things with appropriate statistical tools. And these are skills that are closer to the mindset of an artist than of a scientist.

But anyway… I recently came across the following picture which turns the tables, and uses Statistics to make art. (Or to destroy art, depending on your point of view.) You probably recognise the picture at the head of this post as Van Gogh’s Starry Night, which is displayed at MoMA in New York.

By contrast, the picture below is a statistical reinterpretation of the original version of Starry Night, created by photographer Mario Klingemann through a combination of data visualisation and statistical summarisation techniques.

The Starry Night Pie Packed

As you can see, the original painting has been replaced by a collage of coloured circles, each roughly the same colour as the corresponding part of the original painting. But in closer detail, the circles have an interesting structure. Each is actually a pie chart whose slices correspond, in size and colour, to the proportions of colours in that region of the original picture.
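For the curious, here is a minimal sketch in Python of how the slices of one such pie chart might be computed. To be clear, this is my own guess at the general idea and not Klingemann’s actual method: the function names, the coarse palette and the made-up pixel values are all assumptions for illustration. Each pixel in a region is snapped to a small palette, and the share of pixels falling on each palette colour gives the size of the corresponding slice.

```python
from collections import Counter


def quantise(pixel, levels=4):
    """Snap an (r, g, b) pixel (0-255 per channel) to a coarse palette.

    The palette is an assumption for illustration: each channel is cut
    into `levels` equal bands and replaced by the midpoint of its band.
    """
    step = 256 // levels
    return tuple((c // step) * step + step // 2 for c in pixel)


def colour_proportions(pixels, levels=4):
    """Return {palette colour: share of pixels}, largest share first."""
    counts = Counter(quantise(p, levels) for p in pixels)
    total = sum(counts.values())
    return {colour: n / total for colour, n in counts.most_common()}


# A made-up region of ten pixels: mostly night-sky blue, some star yellow.
region = [(10, 20, 120)] * 6 + [(250, 220, 40)] * 3 + [(0, 0, 0)] * 1
slices = colour_proportions(region)

for colour, share in slices.items():
    print(colour, f"{share:.0%}")
```

Each entry of `slices` is then one slice of the pie: the key gives its colour and the value gives its share of the circle. Applying the same function to every pixel in a painting, rather than to a small region, would give a single pie chart for the whole picture.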

Yes, pointless, but kind of fun nonetheless. You can find more examples of Klingemann’s statistically distorted classical artworks here.

In a similar vein… the diagram below, produced by artist Arthur Buxton, is actually a quiz. Each of the pie charts represents the proportions of the main colours in one of Van Gogh’s paintings. In other words, these pie charts represent the colour distributions over a whole Van Gogh painting, rather than just a small region of a picture, as in the picture above. The quiz is to identify which Van Gogh painting each of the pie charts refers to.

You can find a short description of Arthur Buxton’s process in developing this picture here.

There’s just a small snag: I haven’t been able to locate the answers. My guess is that the pie chart in column 2 of row 2 corresponds to Starry Night. And the one immediately to the left of that is from the Sunflower series. But that’s pretty much exhausted my knowledge of the works of Van Gogh. Let me know if you can identify any of the others and I’ll add them to a list below.


Dance, dance, dance…

Ever thought: ‘I’m pretty sure I would fully understand Statistics, if only a modern dance company would illustrate the techniques for me’?

I hope you get the idea of what I’m trying to do with this blog by now. Fundamentally, Statistics is a very intuitive subject, but that intuition is often masked by technicalities, so that from the outside the subject can seem both boring and impenetrable. The aim of all of my posts is to try to show that neither of those things is true: Statistics is both fascinating and easily understandable. And in this way, whatever your connection to Smartodds, you’ll be better equipped to understand the statistical side of the company’s operations.

Of course, I’m not the only person to try to de-mystify Statistics, and there are many books, blogs and other learning aids with similar aims.

With this in mind, I recently came across a rather unusual set of resources for learning Statistics: a series of dance videos whose aim is to explain statistical concepts through movement. Probably my ‘favourite’ is this one, which deals with the notions of sampling and standard error. You might like to take a look…

I think it fair to say that the comments on these videos on YouTube are mixed. One person wrote:

This way it makes complicated things look simpler. Very informative and useful. Loved it. 🙂

While another said:

this makes simple things look complicated but thanks anyway

So, I guess it depends on your perspective. I think I’m on the side of the latter commenter though: I’m pretty sure that in 5 minutes I could give a much clearer and more entertaining explanation of the issues this film is trying to address than the film does itself. But maybe that’s not the point. Perhaps the point is that different things hook different people in, and while personally I can’t think of a much more complicated way of thinking about issues of sampling and measuring accuracy, the dance perspective seems to work for some people.
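For anyone who prefers code to choreography, here is a minimal sketch in Python of the standard error idea. The population (10,000 values with mean 100 and standard deviation 15), the sample size of 50 and the function names are all made up for illustration. The point is that the standard error estimated from a single sample, namely the sample standard deviation divided by the square root of the sample size, should be close to the amount by which the sample mean actually varies when the sampling is repeated many times.

```python
import math
import random
import statistics


def standard_error(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def simulate_sample_means(population, n, repeats, rng):
    """Draw `repeats` random samples of size n and return their means."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]


rng = random.Random(1)  # fixed seed so the run is repeatable

# A hypothetical population, invented purely for this illustration.
population = [rng.gauss(100, 15) for _ in range(10_000)]

# What we would normally have: one sample, and an estimate from it.
one_sample = rng.sample(population, 50)
print("Estimated standard error:", standard_error(one_sample))

# What we normally cannot do: repeat the sampling many times and look
# at how much the sample mean really varies from sample to sample.
means = simulate_sample_means(population, 50, 2_000, rng)
print("Spread of sample means:  ", statistics.stdev(means))
```

The two printed numbers should be of similar size, both in the region of 15 divided by the square root of 50, which is roughly 2.1.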

Anyway, if you think this might be the key to help you unlock some of the mysteries of Statistics, you can find the full series of four videos here, covering topics like correlation and standard deviation. Enjoy.


Statistics by pictures

Generally speaking there are three main phases to any statistical analysis:

  1. Design;
  2. Execution;
  3. Presentation.

Graphical techniques play an important part in both the second and third phases, but the emphasis is different in each. In the second phase the aim is usually exploratory, using graphical representations of data summaries to hunt for structure and relationships that might subsequently be exploited in a formal statistical model. The graphs here tend to be quick but rough, and are intended more for the statistician than the client.

In the presentation phase the emphasis is a bit different, since the analysis has already been completed, usually involving some sort of statistical model and inference. In this case diagrams are used to highlight the results to clients or a wider audience in a way that illustrates most effectively the salient features of the analysis. Very often the message from a statistical analysis is much more striking when presented graphically rather than in the form of numbers. Moreover, some statisticians have also developed the procedure into something of an art form, using graphical techniques not just to convey the results of the analysis, but also to put them back in the context from which the data derive.

One of my favourite exponents of this technique is Mona Chalabi, who has regular columns in the Guardian, among other places.

Here are a few of her examples:

Most Popular Dog Names in New York


A Complete History of the Legislation of Same-Sex Marriage 


The Most Pirated Christmas Movies


And last and almost certainly least…



Tell you what though… that looks a bit more than 16% to me, suggesting a rather excessive use of artistic licence in this particular case.