Love it or hate it

A while ago I wrote a post about the practice of advertistics – the use, and more often misuse, of Statistics by advertising companies to promote their products. And I referenced an article in the Guardian which included a number of examples of advertistics. One of these examples was Marmite.

You probably know the line: Marmite – you either love it or hate it. That’s an advertistic in itself. And almost certainly provably incorrect – I just have to find one person who’s indifferent to Marmite.

But I want to discuss a slightly different issue. This ‘love or hate Marmite’ theme has turned up as an advertistic for a completely different product…

DNAfit is one of a number of do-it-yourself DNA testing kits. Here’s what they say about themselves:

DNAfit helps you become the best possible version of yourself. We promise a smarter, easier and more effective solution to health and fitness, entirely unique to your DNA profile. Whatever your goal, DNAfit will ensure you live a longer, happier and healthier life.

And here’s the eminent statistician, er, Rio Ferdinand, to persuade you with statistical facts as to why you should sign up with DNAfit.

But where’s the Marmite?

Well, as part of a campaign that was purportedly set up to address a decline in Marmite sales, but was coincidentally promoted as an advertistic for the DNAfit testing kit, a scientific project was set up to find genetic markers that identify whether a person will be a lover or hater of Marmite. (Let’s ignore, for the moment, the fact that the easiest way to discover if a person is a ‘lover’ or ‘hater’ of Marmite is simply to ask them.)

Here’s a summary of what they did:

  • They recruited a sample of 261 individuals;
  • For each individual, they took a DNA sample;
  • They also questioned the individuals to determine whether they love or hate Marmite;
  • They then applied standard statistical techniques to identify a small number of genetic markers that separate the Marmite lovers from the haters. Essentially, they looked for a combination of DNA markers which were present in the ‘haters’, but absent in the ‘lovers’ (or vice versa).

Finally, the study was given a sheen of respectability through the publication of a white paper with various genetic scientists as authors.

But, here’s the typical reaction of another scientist on receiving a press release about the study:

Wow, sorry about the language there. So, what’s wrong?

The Marmite gene study is actually pretty poor science. One reason, as explained in this New Scientist article, is that there’s no control for environmental factors. For example, several members of a family might all love Marmite because the parents do and introduced their kids to it at a very early age. The close family connection will also mean that these individuals have similar DNA. So, you’ll find a set of genetic characteristics that each of these family members have, and they all also love Marmite. Conclusion – these are genetic markers for loving Marmite. Wrong: these are genetic markers for this particular family who, because they share meals together, all love Marmite.

I’d guess there are other factors too. A sample of 261 seems rather small to me. There are many possible genetic markers, and many, many more combinations of genetic markers. With so many options it’s almost certain that purely by chance in 261 individuals you can find one set of markers shared only by the ‘lovers’ and another set shared only by the ‘haters’. We’ve seen this stuff before: look at enough things and something unlikely is bound to occur just by chance. It’s just unlikely to happen again outside of the sample of individuals that took part in the study.

Moreover, there seems to have been no attempt at validating the results on an independent set of individuals.
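To see just how easy it is to be fooled this way, here’s a quick simulation – sketched in Python, with a made-up number of candidate markers. Generate completely random ‘lover/hater’ labels and completely random markers, so there is no real association at all, and the best-matching marker still looks impressive in-sample. Test it on fresh data, though, and it’s back to coin-flipping.

```python
import random

random.seed(1)

N = 261    # individuals, as in the Marmite study
M = 20000  # candidate genetic markers (a hypothetical number)

# Random 'love/hate' labels and random markers: no real association at all.
labels = [random.random() < 0.5 for _ in range(N)]
markers = [[random.random() < 0.5 for _ in range(N)] for _ in range(M)]

def agreement(marker, labels):
    """Fraction of individuals where marker presence matches 'lover' status.

    Folded so that a marker present in the haters but absent in the lovers
    (or vice versa) scores equally highly, as in the study's search.
    """
    hits = sum(m == l for m, l in zip(marker, labels))
    return max(hits, len(labels) - hits) / len(labels)

# The best in-sample marker looks impressive purely by chance...
best = max(markers, key=lambda m: agreement(m, labels))
print(f"best in-sample agreement: {agreement(best, labels):.2f}")

# ...but on an independent validation sample it does no better than chance.
new_labels = [random.random() < 0.5 for _ in range(N)]
print(f"same marker on fresh sample: {agreement(best, new_labels):.2f}")
```

Look at enough markers and one of them is bound to line up with the labels by chance; only validation on independent individuals exposes the fluke.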

Unfortunately for DNAfit and Marmite, they took the campaign one stage further and encouraged Marmite customers – and non-customers – to carry out their own DNA test to see if they were Marmite ‘lovers’ or ‘haters’ using the classification found in the genetic study. If only they’d thought to do this as part of the study itself. Because although the test claimed to be 99.98% accurate, rather many people who paid to be tested found they’d been wrongly classified.

One ‘lover’ who was classified as a ‘hater’ wrote:

I was genuinely upset when I got my results back. Mostly because, hello, I am a ‘lover’, but also because I feel like Marmite led me on with a cheap publicity tool and I fell for it. I feel dirty and used.

While a wrongly-classified ‘hater’ said:

I am somewhat offended! I haven’t touched Marmite since I was about eight because even just the thought of it makes me want to curl up into a ball and scrub my tongue.

Ouch! ‘Dirty and used’. ‘Scrub my tongue’. Not great publicity for either Marmite or DNAfit, and both companies seem to have dropped the campaign pretty quickly and deleted as many references to it as they were able.

Ah, the price of doing Statistics badly.

p.s. There was a warning in the ads about a misclassification rate higher than 0.02% but they just dismissed it as fake news…



A day in the life

Over the next few weeks I’m planning to include a couple of posts looking at the way Statistics gets used – and often misused – in the media.

First though, I want to emphasise the extent to which Statistics pervades news stories. It’s everywhere. But we’re so accustomed to this fact, we hardly pay attention. So, I chose a day randomly last year – when I first planned this post – and made a note of all the articles that I came across which were based one way or another on Statistics.

In no particular order….

Article 1: An analysis of the ways the economy had been affected to date since the Brexit referendum.

Article 2: A report in your super soaraway Sun about research which shows 40% of the British population don’t hold cutlery correctly. (!)

Article 3: A BBC report about a study into heart defects and regeneration rates in Mexican tetra fish which may offer clues to help reduce heart disease rates in humans.

Article 4: A report showing that children’s school performance may be affected by their exact age on entry.

Article 5: A report into the rates of prescriptions of anti-depressants to children and the possible consequences of this.

Article 6: A survey of the number of teenage gamblers.

Article 7: A report on projections of the numbers of people who could be affected by future insulin shortages.

Article 8: A report on a study that suggests children’s weights are not driven by patterns of parental feeding, but rather the opposite: parents tend to adapt feeding patterns to the natural weight of their children.

Article 9: A comparison of football teams in terms of performance this season relative to last season.

Article 10: Not really about statistics exactly, but a report showing that the UK’s top-paid boss is Denise Coates, the co-founder of Bet365, who has just had a pay-rise taking her earnings to £265 million. Includes a nice graphic showing how her salary has risen year-on-year.

Article 11: Report on a study showing failure rates of cars in MOT tests due to excessive emission rates.

Article 12: A report into an increase in the rate of anti-depressant prescriptions following the EU referendum.

Article 13: A report on rates of ice-melt in Antarctica that suggest a sub-surface radioactive source.

Article 14: A report suggesting rats are getting bigger and what the implications might be.

Article 15: An explanation of algorithms that can distinguish between human and bot conversations.

Article 16: A report suggesting that global internet growth is slowing.

So that’s 16 articles in the papers I happened to look at on a random day. Pretty sure I could have picked any day and any set of papers and it would have been a similar story.

Now here’s a challenge: choose your own day and scan the papers (even just the online versions) to see how many stories have an underlying statistical content. And if you find something that’s suitable for the blog, please pass it on to me – that would be a great bonus.

When I was a kid I went on a school exchange trip to Germany. For some reason we had a lesson with our German hosts in which we were asked to explain the meaning of the Beatles’ ‘A Day in the Life’….

Embarrassingly, I think I tried to give a literal word-by-word interpretation. But if I’d known then what I know about Statistics now, I think I could probably have made a better effort.

Here are the lyrics from one of the verses…

Ah I read the news today, oh boy
Four thousand holes in Blackburn, Lancashire
And though the holes were rather small
They had to count them all
Now they know how many holes it takes to fill the Albert Hall

Family problems

In an earlier post, I set the following problem:

You’re at a party and meet someone. After chatting for a bit, you work out that the girl you’re talking to has 2 kids and one of them is a boy. What are the chances she’s got 2 boys rather than a boy and a girl?

Following some feedback, I later updated that post to clarify that the question isn’t intended to be about possible slight differences in the birth rates of boys and girls. That’s an interesting biological and demographic issue, but wasn’t intended as the point of the question. For the purposes of the question, simply assume that every child, regardless of the mother’s previous history of births, is equally likely to be a boy or a girl.

In that case, it’s very tempting to answer the question above as 1/2. Indeed, this was the point of the post. One child’s a boy. The other is just as likely to be a boy or a girl, so the answer must be 1/2.

Except it’s not.

The answer is 1/3, and here’s the reasoning…

Without any additional information, if a woman has a 2-child family the possibilities (with the oldest child listed first) are:

Boy-Boy, Boy-Girl, Girl-Boy, Girl-Girl

and because all children are equally likely to be male or female, these combinations are all equally likely. But we can rule out the Girl-Girl combination from the information in the problem, so the remaining possibilities are

Boy-Boy, Boy-Girl, Girl-Boy

with each being equally likely. And only in the first of these pairs is the other child also a boy. So, in just 1 of the 3 equally likely outcomes is the other child a boy, and the probability is therefore 1/3.

This simple problem illustrates the difficulty in calculating what is called a conditional probability – the probability of something conditional on some given information, in this case that one of the children is a boy. Whereas in general the chance of a child being a boy is 1/2, once you include the extra information that we’re looking at a 2-child family in which at least one of the children is a boy, the probability changes. At first it seems counter-intuitive, but once you’ve seen a few problems of this type, your intuition becomes sharper.
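If you’re still not convinced, it’s easy to check by simulation – here’s a sketch in Python. Generate lots of random 2-child families, keep only those with at least one boy, and count how often both children are boys.

```python
import random

random.seed(0)

# Each child is independently a boy ('B') or girl ('G') with probability 1/2.
families = [(random.choice("BG"), random.choice("BG")) for _ in range(100_000)]

# Condition on the information in the problem: at least one child is a boy.
at_least_one_boy = [f for f in families if "B" in f]
both_boys = [f for f in at_least_one_boy if f == ("B", "B")]

print(len(both_boys) / len(at_least_one_boy))  # close to 1/3, not 1/2
```

Roughly three-quarters of the families survive the conditioning, and about a third of those are Boy-Boy – exactly the argument above, in numerical form.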

With that in mind, let me pose the same problem as above, but suppose you find out that one of the woman’s kids is a boy that was born on a Tuesday. Now what’s the probability that the other child is also a boy?

As usual, I’ll write a future post with the solution and discussion. But if you’d like to send me your own ideas, either by mail or via this survey form, I’d be really happy to hear from you.

Thanks to those of you who replied to the original question. Apart from some initial confusion about whether I was suggesting boys might be more or less common than girls in general, there was roughly a 50-50 split in answers between 1/2 and 1/3. As explained above, it’s easy to be misled into thinking the answer is 1/2, so there is no embarrassment in arriving at that answer. And like I say, that was the whole point of the post. Nonetheless, well done to those of you who got the correct answer of 1/3.

It’ll be interesting to see what replies I get to the revised problem above, so please do send me your answer or thoughts on the problem if you have time.

Weapons of math destruction

I haven’t read it, but Cathy O’Neil’s ‘Weapons of Math Destruction’ is a great title for a book. Here’s what one reviewer wrote:

Cathy O’Neil, an experienced data scientist and mathematics professor, illustrates the pitfalls of allowing data scientists to operate in a moral and ethical vacuum, including how the poor and disadvantaged are targeted for payday loans, high-cost insurance and political messaging on the basis of their zipcodes and other harvested data.

So, WOMD shows how the data-based algorithms that increasingly form the fabric of our lives – from Google to Facebook to banks to shopping to politics – and the statistical methodology behind them are actually pushing societies in the direction of greater inequality and reduced democracy.

At the time WOMD was written these arguments were still in their infancy; but now that we are living the repercussions of the successful campaign to remove Britain from the EU – which was largely driven by a highly professional exercise in Data Science – they seem much more relevant and urgent.

Anyway, Cathy O’Neil herself recently gave an interview to Bloomberg. Unfortunately, you now have to subscribe to read the whole article, so you won’t see much if you follow the link. But it was an interesting interview for various reasons. In particular, she discussed the trigger which led her to a love of data and mathematics. She wrote that when she was nine her father showed her a mathematics puzzle. And solving that problem led Cathy to a lifelong appreciation of the power of mathematical thinking. She wrote..

… I’ve never felt more empowered by anything since.

It’s more of a mathematical than a statistical puzzle, but maybe you’d like to think about it for yourself anyway…

Consider this diagram:

It’s a chessboard with 2 of the corner squares removed. Now, suppose you had a set of 31 dominoes, with each domino being able to cover 2 adjacent horizontal or vertical squares. Your aim is to find a way of covering the 62 squares of the mutilated board with the 31 dominoes. If you’d like to try it, mail me with either a diagram or photo of your solution; or, if you think it can’t be done, mail me an explanation. I’ll discuss the solution in a future post.




Here’s a problem for you. You’re an executive member of a medium-sized company. You have quite a few employees whose livelihoods are dependent on the ongoing success of the company. The company is performing reasonably well, but someone high up in the company – let’s, for argument’s sake, say the company’s owner – is a bit of a loose cannon. Maybe he’s prone to say the wrong thing in the wrong place sometimes. Maybe he’s got a skeleton or two in his cupboard that are best kept well-hidden. And bad publicity could badly damage the reputation and value of the company, potentially costing money and jobs. What are you going to do?

Well, it turns out you need the help of statisticians.

You’ll know all about car insurance. You pay a premium, whose cost is calculated on the basis of a number of factors including your likelihood of having an accident, the value of the car, the rate of claims in the area you live, and so on. And if you have an accident or your car is stolen, then you can claim against the insurance policy. It’s a negative value bet – on average you will pay out more money in premiums than you will regain in claims – but to protect yourself against the huge losses that might be incurred by writing-off your car, or in the damages you might cause to a third party, it’s a bet you would probably take. Actually, it’s a bet you’re legally obliged to make if you want to drive a car.

But how are the risks evaluated and the prices set? Essentially on the basis of statistical models. An insurance company will have a record of previous claims and the individual and demographic characteristics of the customers making those claims. It’s then a fairly standard statistical modelling procedure to relate the chance of a customer making a claim, and the average cost when they do, to the available characteristics.

We met something like this before in the context of expected goals (xG). In that setting we had a number of characteristics on a game play and wanted to calculate the probability a goal would be scored. Swap game state for customer characteristics and goal-scored for claim-made and you can see the problem is structurally the same. Well, almost. A game play can only lead to a single goal, whereas an insurance customer might make several claims in a given period. But essentially the principle is the same: use the characteristic information – game play or customer type – to get the best predictor of some outcome – goal scored or claim made.
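As a toy illustration of the pricing step – with entirely made-up numbers and risk groups – the premium for each group of customers might be set as the estimated expected claim cost for that group, inflated by a loading for expenses and profit:

```python
from collections import defaultdict

# Hypothetical historical records: (risk_group, claim_cost), with cost 0
# for a policy year in which no claim was made.
history = [
    ("urban_young", 0), ("urban_young", 3200), ("urban_young", 0), ("urban_young", 1800),
    ("rural_older", 0), ("rural_older", 0), ("rural_older", 0), ("rural_older", 900),
]

def expected_cost(records):
    """Estimate the expected annual claim cost per customer in each group."""
    totals, counts = defaultdict(float), defaultdict(int)
    for group, cost in records:
        totals[group] += cost
        counts[group] += 1
    return {g: totals[g] / counts[g] for g in totals}

LOADING = 1.25  # margin for expenses and profit
premiums = {g: round(c * LOADING, 2) for g, c in expected_cost(history).items()}
print(premiums)  # {'urban_young': 1562.5, 'rural_older': 281.25}
```

In practice an insurer would model claim frequency and claim size separately, with regression models relating each to many customer characteristics at once; but the principle – premium follows estimated expected cost – is exactly this.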

But, I digress. It turns out that just like protecting yourself through insurance against the potential costs of a car accident, you can protect your company against the potential embarrassment of bad behaviour by any of its employees. Or owners.

Welcome to the world of: disgrace insurance.

Yes, it turns out that you can insure your company against the fallout of bad headlines caused by any disgraceful behaviour by the members of your company. This type of insurance has apparently been around for quite a while, but the avalanche of recent celebrity scandals and a shift in funding mechanisms has altered the dynamics. Leading the way now is the start-up company SpottedRisk. They say of themselves:

SpottedRisk™ has completely reinvented the decades-old disgrace insurance product in order to meet the needs of today’s market.

What’s especially interesting here from a statistical point of view is the risk evaluation aspect. SpottedRisk have amassed a database of some 27,000 celebrities and used various metrics of their behaviour as predictors for subsequent scandals. Then, like customer characteristics and insurance claims, or game position and goal scored, they can build a model to use one to predict the other. And once they’ve evaluated the risk of a scandal and its likely cost, they can set the premium accordingly.

The amount paid after a scandal depends on its severity. Or what SpottedRisk call the ‘Tier of Outcry’. And they give some theoretical examples:

  • Roseanne Barr. Sent a number of racist and conspiracy-theory tweets and was dropped from her own show. Tier of outcry level 2. Payout $6 million.
  • Kevin Spacey. Accused by several men, some underage, of sexual harassment. Dropped from various film productions and other work activities. Tier of outcry level 4. Payout $8 million.
  • Harvey Weinstein. Industrial amounts of sexual misconduct. Persona non grata pretty much everywhere. Tier of outcry level 5. Payout $10 million.

But there’s just something I don’t quite get with this business model. A celebrity will be publicly disgraced on the occurrence of two events:

  1. He/she will have done something disgraceful;
  2. That disgraceful thing will come to light and be publicised.

Now, the celebrity and the insurance company can each make an assessment of how likely the second of these is, but the celebrity has much better knowledge than the insurance company about whether they really have something to hide – that’s to say, whether the first point is triggered. So the value of an insurance premium is much better known to the customer than to the company, which can only have a vague idea of the first point, even if it can calculate the second better than the celebrity. This is unlike car insurance, where the company is probably better able to evaluate a customer’s total risk than the customer themselves. A client here is therefore in the unusual position of knowing whether the premium offered is good value or not. This doesn’t really make much sense to me.
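To put some (entirely invented) numbers on this asymmetry: suppose the insurer prices a policy using its population-wide estimate of the chance a client has something to hide, while each client knows that, for them, the chance is really 0 or 1.

```python
# Hypothetical numbers illustrating the information asymmetry.
PAYOUT = 6_000_000           # payout if a scandal surfaces
P_SURFACES = 0.5             # chance a hidden skeleton comes to light
                             # (both sides can estimate this reasonably well)
P_SKELETON_POPULATION = 0.1  # insurer's estimate that a typical client has one

# The insurer prices for the average celebrity:
premium = PAYOUT * P_SURFACES * P_SKELETON_POPULATION
print(f"premium: ${premium:,.0f}")

# But each celebrity knows whether they have a skeleton (p = 0 or p = 1):
value_if_clean = 0 - premium                     # a pure loss
value_if_guilty = PAYOUT * P_SURFACES - premium  # a handsome bargain
print(f"value to a 'clean' client:  ${value_if_clean:,.0f}")
print(f"value to a 'guilty' client: ${value_if_guilty:,.0f}")
```

The ‘clean’ clients face a guaranteed loss and walk away; the ‘guilty’ ones see a bargain and sign up. That’s the classic adverse selection problem, and it’s why the premium priced for the average client can’t be sustained.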

Additionally, the theoretical payout on Harvey Weinstein is $10 million. This is probably a fraction of the amount spent on any of the films whose production he was involved in, and it seems fanciful to think that a film studio would have bothered to insure itself against that amount of loss.

So, to my mind, something doesn’t quite add up.

Finally: is everyone insurable against disgrace? Apparently yes, except for R. Kelly and Donald Trump, the latter of whom would “probably trigger a claim every week”, according to SpottedRisk’s behavioural scientist Pete Dearborn.

The opening paragraph of this blog post is a work of fiction. Names, characters, businesses, places, events, locales, and incidents are either the products of the author’s imagination or used in a fictitious manner. Any resemblance to actual persons, living or dead, or actual events is purely coincidental.







In a recent  Guardian article, Arwa Mahdawi defines something that she jokingly calls ‘advertistics’. These are statistics based on surveys that are designed to generate results that are useful for advertising a particular product. This might be achieved in different ways, including:

  • The question might be asked in a pointed way which steers respondents in a particular direction;
  • The sample of individuals surveyed might be creatively chosen so that they are more likely to answer in a particular way;
  • An incentive might be offered to respondents who give particular answers;
  • Surveys might be ignored and repeated until the desired outcome is achieved;
  • The survey and the statistics might just be made up.

But whichever method is used, the results are presented as if they are genuinely representative of the wider population. These are advertistics.

One example referred to in Arwa’s Guardian article is a survey of Americans which concluded that 45% of Americans wear the same underpants for at least 2 consecutive days and that American men are 2.5 times as likely as women to have worn their underwear unchanged for more than a week. But here’s the catch: the survey was carried out by an underwear manufacturer, and the details of their survey design are unavailable. So, it’s impossible to know whether the individuals they sampled were genuinely representative of the wider American population, and therefore whether the 45% advertistic has any basis in reality. Nonetheless, it’s convenient for the underwear company to present it as if it does in order to strengthen their campaign for people to replace their underwear more frequently. By buying more of their products, of course.

Another example: I’m old enough to remember ads produced by the cat-food manufacturer Whiskas that claimed:

8 out of 10 cats prefer Whiskas.

Except it turned out that:
  1. Nobody asked the cats; and
  2. Many owners didn’t reply.

So they were forced to change the tag line to:

8 out of 10 owners who expressed a preference said their cat prefers it.

Definitely not as snappy, though scientifically more correct. Yet without further details on exactly how the survey was conducted, doubts remain about the validity of the 8 out of 10 advertistic even with the added caveats.

Finally, remember that things can change in time, and statistics – and advertistics – will change accordingly. Arguably the most famous advertistic of all time is the ‘fact’ that Carlsberg is…

Probably the best beer in the world

Except, shockingly, it no longer is. The latest Carlsberg campaign includes the admission that Carlsberg is

Probably not the best beer in the world.

Which to believe? Well, the new campaign comes with evidence supplied by Carlsberg drinkers including the claims that

Carlsberg tastes like stale breadsticks

and that drinking Carlsberg is like…

… drinking the bathwater your nan died in

So, on the strength of evidence, we’re going to have to accept that Carlsberg’s not the best.



Fringe benefits

The Edinburgh Fringe Festival is the largest arts festival in the world. The 2019 version has just finished, but Wikipedia lists some of the statistics for the 2018 edition:

  1. the festival lasted 25 days;
  2. it included more than 55,000 performances;
  3. that comprised 3548 different shows.

The shows themselves are of many different types, including theatre, dance, circus and music. But the largest section of the festival is comedy, and performers compete for the Edinburgh Comedy Awards – formerly known as the Perrier Award – which is given to the best comedy show on the fringe.

I mention all this because the TV channel Dave also publishes what it regards as the 10 best jokes of the festival. And number 4 this year was a statistical joke.


A cowboy asked me if I could help him round up 18 cows. I said, “Yes, of course. That’s 20 cows.”

Confession: the joke is really based on arithmetic rather than Statistics.

About a boy

You’re at a party and meet someone. After chatting for a bit, you work out that the girl you’re talking to has 2 kids and one of them is a boy. What are the chances she’s got 2 boys rather than a boy and a girl?

Actually, I really want to ask a slightly more complicated question than this. But let’s take things slowly. Please think about this problem and, if you have time, mail me or send me your answer via this form. Subsequently, I’ll discuss the answer to this problem and ask you the slightly more complicated question that I’m really interested in.

Quick update: just to be clear, assume that all children are equally likely to be born male or female, and that this doesn’t change even if a mother has already given birth to previous children of known gender.

Terrible maps

One of the themes in this blog has been the creative use of diagrams to represent statistical data. When the data are collected geographically this amounts to using maps to represent data – perhaps using colours or shadings to show how a variable changes over a region, country or even the whole world.

With this in mind I recommend to you @TerribleMaps on twitter.

It’s usually entertaining, and sometimes – though not always – scientific. Here are a few recent examples:

  1. Those of you with kids are probably lamenting right now the length of the summer holidays. But just look how much worse it could be if, for example, you were living in Italy (!):
  2. Just for fun… a map of the United States showing the most commonly used word in each state:
  3. A longitudinal slicing of the world by population size. It’s interesting because the population of each slice depends both on the number of countries it includes and on the population density within those countries.
  4. For each country in the following map, the flag shown is that of the country with which it shares the longest border. For example, the UK has its longest border with Ireland, and so is represented by the Ireland flag. Similarly, France’s flag is that of Brazil!
  5. This one probably only makes sense if you were born in, or have spent time living in, Italy
  6. While this one will help you get clued-up on many important aspects of UK culture:
  7. And finally, this one will help you understand how ‘per capita’ calculations are made. You might notice there’s one country with an N/A entry. Try to identify which country that is and explain why its value is missing.

In summary, as you’ll see from these examples, the maps are usually fun, sometimes genuinely terrible, but sometimes contain a genuine pearl of statistical or geographical wisdom. If you have to follow someone on twitter, there are worse choices you could make.



You looking at me?

Statistics: helping you solve life’s more difficult problems…

You might have read recently – since it was in every news outlet here, here, here, here, here, here, and here for example – that recent research has shown that staring at seagulls inhibits them from stealing your food. This article even shows a couple of videos of how the experiment was conducted. The researcher placed a package of food some metres in front of her in the vicinity of a seagull. In one experiment she watched the bird and timed how long it took before it snatched the food. She then repeated the experiment, with the same seagull, but this time facing away from the seagull. Finally, she repeated this exercise with a number of different seagulls in different locations.

At the heart of the study is a statistical analysis, and there are several points about both the analysis itself and the way it was reported that are interesting from a wider statistical perspective:

  1. The experiment is a good example of a designed paired experiment. Some seagulls are more likely to take food than others regardless of whether they are being looked at or not. The experiment aims to control for this effect by using pairs of results from each seagull: one in which the seagull was stared at, the other where it was not. By using knowledge that the data are in pairs this way, the accuracy of the analysis is improved considerably. This makes it much more likely to identify a possible effect within the noisy data.
  2. To avoid the possibility that, for example, a seagull is more likely to take food quickly the second time, the order in which the pairs of experiments are applied is randomised for each seagull.
  3. Other factors are also controlled for in the analysis: the presence of other birds, the distance of the food, the presence of other people and so on.
  4. The original experiment involved 74 birds, but many were uncooperative and refused the food in one or other of the experiments. In the end the analysis is based on just 19 birds which took food both when being stared at and not. So even though the results proved significant, it’s worth remembering that the sample on which they were based is very small.
  5. It used to be very difficult to verify the accuracy of a published statistical analysis. These days it’s almost standard for data and code to be published alongside the manuscript itself. This enables readers to both check the results and carry out their own alternative analyses. For this paper, which you can find in full here, the data and code are available here.
  6. If you look at the code it’s just a few lines of R. It’s notable that such a sophisticated analysis can be carried out with such simple code.
  7. At the risk of being pedantic, although most newspapers went with headlines like ‘Staring at seagulls is best way to stop them stealing your chips’, that’s not really an accurate summary of the research at all. Clearly, a much better way to stop seagulls eating your food is not to eat in the vicinity of seagulls. (Doh!) But even aside from this nit-picking point, the research didn’t show that staring at seagulls stopped them ‘stealing your chips’. It showed that, on average, the seagulls that bother to steal your chips, do so more quickly when you are looking away. In other words, the headline should be:

If you insist on eating chips in the vicinity of seagulls, you’ll lose them quicker if you’re not looking at them

Guess that’s why I’m a statistician and not a journalist.
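Point 1 above – why pairing helps – is worth a quick illustration. Here’s a small simulation in Python with invented numbers: each gull gets its own baseline ‘boldness’, which varies far more between birds than the staring effect itself. Compare raw times across the two conditions and the between-bird variation drowns the signal; difference within each pair and it disappears.

```python
import random
import statistics

random.seed(2)

# Hypothetical paired data: each gull has its own baseline boldness, plus a
# small, consistent effect of being stared at.
n_gulls = 19
stared, not_stared = [], []
for _ in range(n_gulls):
    baseline = random.gauss(60, 25)  # seconds; varies a lot between gulls
    effect = 20                      # assumed delay caused by staring
    stared.append(baseline + effect + random.gauss(0, 5))
    not_stared.append(baseline + random.gauss(0, 5))

# Unpaired view: the between-gull variation swamps the 20-second effect.
pooled_sd = statistics.stdev(stared + not_stared)

# Paired view: differencing within each gull removes that variation entirely.
diffs = [s - n for s, n in zip(stared, not_stared)]
print(f"sd of raw times:   {pooled_sd:5.1f}s")
print(f"sd of differences: {statistics.stdev(diffs):5.1f}s")
print(f"mean difference:   {statistics.mean(diffs):5.1f}s")
```

The standard deviation of the differences is a fraction of that of the raw times, so the same 20-second effect stands out clearly in the paired analysis – which is exactly why the researcher used each gull as its own control.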

The issue of designed statistical experiments was something I also discussed in an earlier post. As I mentioned then, it’s an aspect of Statistics that, so far, hasn’t much been exploited in the context of sports modelling, where analyses tend to be based on historically collected data. But in the context of gambling, where different strategies for betting might be compared and contrasted, it’s likely to be a powerful approach. In that case, the issues of controlling for other variables – like the identity of the gambler or the stake size – and randomising to avoid biases will be equally important.