Santa Claus is coming to town

The substance of this post, including the terrible joke in the finale, is all stolen from here.

Look at this graph. The Santas represent points on the graph, and broadly show that the closer you get to Christmas, the more numerous the sightings of Santa. (Presumably in supermarkets and stores, rather than in grottos and sleighs, but you get the idea).

As discussed in previous posts – here, for example – we can measure the extent to which these two variables are related using the correlation coefficient. If the data lined up perfectly on an increasing straight line, the correlation would be 1. If the variables were completely unrelated, the correlation would be close to zero. (It’s unlikely to be exactly zero, due to random variation.)

For the Santa data, the correlation is probably around 0.95. It’s not quite 1 for two reasons: first, there’s a bit of noise around the general trend; second, the relationship itself looks slightly curved. But anyway, there’s a clear pattern to be observed: as Christmas approaches, the sightings of Santa increase, and this manifests itself in a correlation coefficient close to 1.
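If you want to experiment with this kind of calculation, here’s a minimal sketch in Python. The Santa data themselves aren’t available, so the numbers below are simulated to mimic the graph; the trend shape and noise level are my own invented assumptions.

```python
import numpy as np

rng = np.random.default_rng(25)

# Dates running up to Christmas, 1st to 24th December
date_in_december = np.arange(1, 25)

# Simulated Santa sightings: a slightly curved increasing trend plus noise.
# Both the trend and the noise level are invented purely for illustration.
sightings = 2 * date_in_december ** 1.3 + rng.normal(0, 4, size=date_in_december.size)

# Pearson correlation between date and number of sightings
r = np.corrcoef(date_in_december, sightings)[0, 1]
print(f"correlation: {r:.2f}")  # close to, but a little below, 1
```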

What’s the effect of this relationship? Well, changing the time period before Christmas – say moving from a month before Christmas to a week before Christmas – will change the number of Santas you’re likely to see. But does it work the other way round? If we dressed a few extra people up as Santa, would it change the number of days left till Christmas? Clearly not. There’s a cause-and-effect relationship between the two variables in the graph, but it only works in one direction. The number of days left till Christmas affects the number of Santas you see on the street, but it simply doesn’t work the other way around.

Conclusion:

Correlation doesn’t imply Clausality!

Hohoho.


Footnote: the correct version of this phrase, ‘Correlation doesn’t imply Causality’, was the subject of an earlier post.

Just lucky

According to world chess champion Magnus Carlsen, the secret to his success is…

I’ve just been lucky.

Lucky? At chess?

Well, no, actually. This is Carlsen talking about his success at Fantasy Football. At the time of writing, Carlsen’s Premier League Fantasy Football team, Kjell Ankedal, is top of the League:

Top of the league sounds great, but this picture, which shows just the top 10 teams, is a little misleading. The Premier League Fantasy Football League actually has more than 6 million teams, and Kjell Ankedal is currently top of all of them. Moreover, Kjell Ankedal has finished in the top 5% of the league in each of the past 4 seasons, and in 2017-18 finished 2397th. Again, with 6 million teams, the 2017-18 result would place Carlsen in the top 0.04%.

Obviously, football – and by extension fantasy football – is a game with many more sources of random intervention than chess, including the referee, the weather, VAR, the managers and just the inevitable chaos that can ensue from the physics of 22 people chasing, kicking and punching a ball. Compare that with the deterministic simplicity of a chess move such as e4.

And yet…

Can it be that Carlsen is ‘just lucky’ at Fantasy Football? Lucky to be top of the league after finishing in the top 5% or so, year after year? Well, we could make some assumptions about Carlsen actually being just an average player, and then work out the probability that he would have achieved the results he has, over this and recent seasons, if he were really just lucky rather than a very good player…

And it would be vanishingly small.
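We don’t need to do the calculation precisely to see the point. Here’s a rough back-of-envelope sketch, under the deliberately naive assumptions that an average player is equally likely to land anywhere in the rankings and that seasons are independent:

```python
# Probability that an *average* player, by pure luck:
#   - finishes in the top 5% in each of the past 4 seasons, and
#   - currently sits top of roughly 6 million teams.
# Both assumptions are naive simplifications, just for illustration.

p_top5_each_season = 0.05 ** 4        # top 5% four seasons running
p_first_now = 1 / 6_000_000           # currently ranked 1st of 6 million
p_luck = p_top5_each_season * p_first_now

print(f"{p_luck:.2e}")  # around 1e-12: vanishingly small
```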

In his Ted Talk, Rasmus Ankersen says that the famous quote ‘The league table never lies’ should be replaced with ‘The league table always lies’. There’s simply too much randomness in football matches for a league table based on 38 matches or so per team to end up with a ranking of teams that reflects their exact ability. And yet, if you look at the top and bottom of most league tables there are very few surprises. League tables are noisy arrangements of teams ranked by their ability, but they are not just total chaos. Better teams generally do better than poorer teams, and teams are never champions or relegated just due to good or bad luck. So, to be in the top few percent of players, consistently over several seasons, with so many people playing is just implausible unless Carlsen is a much-better-than-average player.

So, while it’s true that Carlsen’s precise Fantasy Football ranking is affected to a greater extent by luck than is his world chess ranking, it’s probably a little disingenuous for him to say he’s just been lucky.

And maybe it’s no coincidence that someone who’s eminently brilliant at chess turns out also to be eminently brilliant at fantasy football. Maybe one of the keys to Carlsen’s success at chess is an ability to optimise his strategy over the uncertainty in the moves his opponent will make.

Or maybe he’s just brilliant at everything he does.


Obviously, what applies to Carlsen with respect to Fantasy Football applies equally well to betting syndicates trading on football markets. Luck will play a large part in determining short term wins and losses, but in the very long term luck is ironed out, and what determines the success of the syndicate is their skill, judgement and strategy.

Friday the 13th


Friday 13th. What could possibly go wrong today?

Well, according to people who suffer from Friggatriskaidekaphobia – the fear of Friday 13th – rather a lot. But is there any rationale for a fear of Friday 13th?

The scientific evidence is patchy. One study published in the British Medical Journal – ‘Is Friday the 13th bad for your health?’ – apparently found a 52% increase in hospital admissions from road accidents on Fridays that fell on the 13th of the month, compared with other Fridays. However, one of the authors, Robert Luben, was subsequently quoted as saying:

It’s quite amusing and written with tongue firmly in cheek. It was written for the Christmas edition of the British Medical Journal, which usually carries fun or spoof articles.

I guess the authors looked at several possible statistics and reported the one that, by chance, fitted the hypothesis of Friday the 13th being unlucky. We’ve discussed this issue before: if you look at enough different phenomena where there is nothing of interest, some of them will look like there is something interesting happening just by chance. Statistics as a subject can be – and often is – badly misused this way.
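This effect is easy to demonstrate with a simulation. The sketch below tests 100 phenomena where genuinely nothing is going on, each at the conventional 5% significance level, and counts how many come out ‘significant’ purely by chance. The details – normal data, two-sample t-tests – are my own arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)

n_tests = 100
false_positives = 0
for _ in range(n_tests):
    # Two samples drawn from the *same* distribution: no real effect exists
    a = rng.normal(0, 1, size=50)
    b = rng.normal(0, 1, size=50)
    _, p_value = stats.ttest_ind(a, b)
    if p_value < 0.05:
        false_positives += 1

# On average about 5 of the 100 null tests look 'significant' by chance alone
print(f"{false_positives} of {n_tests} tests significant at the 5% level")
```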

Not everyone seemed to see it as a joke though. A follow-up study in the American Journal of Psychiatry titled ‘Traffic Deaths and Superstition on Friday the 13th’ found a higher accident rate for women, but not men, on Fridays falling on the 13th of the month. This was subsequently contested by another group of researchers who published an article in the journal BMC Public Health titled ‘Females do not have more injury road accidents on Friday the 13th’. Who to believe?

So, it’s a mixed bag. Moreover, as reported in Wikipedia – which gives an interesting history of the origins of the superstitions associated with Friday 13th – road accidents, in the Netherlands at least, are less frequent on Friday 13th, arguably because people take more care than usual. But even there I’d be cautious about the results without having a detailed look at the way the statistical analysis was carried out.

And anyway, Tuesday 8th is the new Friday 13th. You’ve been warned.


Footnote: I’m writing this on Thursday 12th, blissfully unaware of whatever horrors this particular Friday 13th will bring.

Cube-shaped poo

Do you like pizza? If so, I’ve got good and bad news for you.

The good news is that the 2019 Ig Nobel prize winner in the category of medicine is Silvano Gallus, who received the award for…

… collecting evidence that pizza might protect against illness and death…

The bad news, for most of you, is that this applies…

…if the pizza is made and eaten in Italy.

Obviously, it’s a bit surprising that pizza can be considered a health food. But if you accept that, it’s also a bit surprising that it has to be Italian pizza. So, what’s going on?

The Ig Nobel prizes are a satirical version of the Nobel prizes. Here’s the Wikipedia description:

The Ig Nobel Prize (/ˌɪɡnoʊˈbɛl/ IG-noh-BEL) is a satiric prize awarded annually since 1991 to celebrate ten unusual or trivial achievements in scientific research, its stated aim being to “honor achievements that first make people laugh, and then make them think.” The name of the award is a pun on the Nobel Prize, which it parodies, and the word ignoble.

As such, the prize is awarded for genuine scientific research, but for areas of research that are largely incidental to human progress and understanding of the universe. For example, this year’s prize in the field of physics went to a group of scientists for…

…studying how, and why, wombats make cube-shaped poo.

It’s in this context that Silvano Gallus won his award. But although the Ig Nobel award says something about the irrelevance of the subject matter, it’s not intended as a criticism of the quality of the underlying research. Gallus’s work with various co-authors (all Italian) was published as an academic paper, ‘Does Pizza Protect Against Cancer?’, in the International Journal of Cancer. This wouldn’t happen if the work didn’t have scientific merit.

Despite this, there are reasons to be cautious about the conclusions of the study. The research is based on a type of statistical experimental design known as a case-control study. This works as follows. Suppose, for argument’s sake, you’re interested in testing the effect of pizzas on the prevention of certain types of disease. You first identify a group of patients having the disease and ask them about their pizza-eating habits. You then also find a group of people who don’t have the disease and ask them about their pizza-eating habits. You then check whether the pizza habits are different in the two groups.

Actually, it’s a little more complicated than that. It might be that age or gender or something else is also different in the two groups, so you also need to correct for these effects as well. But the principle is essentially just to see whether the tendency to eat pizza is greater in the control group – if so, you conclude that pizza is beneficial for the prevention of the specified disease. And on this basis, for a number of different cancer-types, Silvano Gallus and his co-authors found the proportion of people eating pizzas occasionally or regularly to be higher in the control group than in the case group.
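In practice, the comparison in a case-control study is usually summarised by an odds ratio. Here’s a minimal sketch of that calculation with an invented 2×2 table – the numbers below are made up for illustration and are not Gallus’s data:

```python
# Hypothetical case-control counts (invented purely for illustration):
#                     pizza eaters    non pizza eaters
# cases (disease)           60              140
# controls                  90              110

cases_pizza, cases_no_pizza = 60, 140
controls_pizza, controls_no_pizza = 90, 110

# Odds of being a pizza eater among cases, and among controls
odds_cases = cases_pizza / cases_no_pizza
odds_controls = controls_pizza / controls_no_pizza

# An odds ratio below 1 points towards pizza being more common in controls,
# i.e. towards a protective association (though not towards causation)
odds_ratio = odds_cases / odds_controls
print(f"odds ratio: {odds_ratio:.2f}")
```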

Case-control studies are widely used in medical and epidemiological studies because they are quick and easy to implement. The more rigorous ‘randomised controlled study’ would work as follows:

  1. You recruit a number of people for the study, none of whom have the disease of interest;
  2. You randomise them into two groups. One of the groups will be required to eat pizza on a regular basis; the other will not be allowed to eat pizza;
  3. You follow the 2 groups over a number of years and identify whether the rate of disease turns out to be lower in the pizza-eating group than in the non-pizza-eating group;
  4. Again, you may want to correct for other differences in the 2 groups (though the need for this is largely eliminated by the randomisation process).

Clearly, for both logistic and time reasons, a randomised controlled study is completely unrealistic for studying the effects of pizza on disease prevention. However, in terms of reliability of results, case-control studies are generally inferior to randomised controlled studies because of the potential for bias.

In case-control studies the selection of the control group is extremely important, and it might be very easy to fall into the trap of inadvertently selecting people with an unusually high rate of eating pizzas. (If, for example, you surveyed people while standing outside a pizzeria.) It’s also easy – by accident or design – for the researcher to get the answer they might want when asking a question. For example: “you eat a lot of pizza, don’t you?” might get a different response from “would you describe yourself as a regular pizza eater?”. Moreover, people simply might not have an accurate picture of their long-term eating habits. But most importantly, you are asking people with, for example, cancer of the colon whether they are regular pizza eaters. Quite plausibly this type of disease has a big effect on diet, and one can well imagine that pizzas are not advised by doctors. So although the pizza-eating question is probably intended to relate to the period prior to getting the disease, it’s possible that people with the disease are no longer tending to eat pizza, and respond accordingly.

Finally, even if biases are eliminated by careful execution of the study, there’s the possibility that the result is misleading anyway. It may be that although pizzas seem to give disease protection, it’s not the pizza itself that’s providing the protection, but something else that is associated with pizza eating. For example, regular pizza eating might just be an indicator of someone who simply has regular meals, which may be the genuine source of disease protection. There’s also the possibility that while the rates of pizza eating are lower among the individuals with the specified diseases, they are much higher among individuals with other diseases (heart problems, for example). This could have been identified in a randomised controlled study, but flies completely under the radar in a case-control study.

So, case-control studies are a bit of a minefield, with various potential sources of misleading results, and I would remain cautious about the life-saving effects of eating pizza.

And finally… as with all statistical analysis, any conclusions made on the basis of sample results are only relevant to the wider population from which that sample was drawn. And since this study was based on Italians eating Italian pizzas, the authors conclude…

Extension of the apparently favorable effect of pizza on cancer risk in Italy to other types of diets and populations is therefore not warranted.

So, fill your boots at Domino’s Pizzas, but don’t rely on the fact that this will do much in the way of disease prevention.

Monkey business

Stick a monkey on a typewriter, let him hit keys all day, and what will you get? Gibberish, probably. But what if you’re prepared to wait longer than a day? Much longer than a day. Infinitely long, say. In that case, the monkey will produce the complete works of Shakespeare. And indeed any and every other work of literature that’s ever been written.

This is from Wikipedia:

The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type any given text, such as the complete works of William Shakespeare.

Infinity is a tricky but important concept in mathematics generally. We saw the appearance of infinity in a recent post, where we looked at the infinite sequence of numbers

1, 1/2, 1/4, 1/8,….

and asked what their sum would be. And it turned out to be 2. In practice, you can never really add infinitely many numbers, but you can add more and more terms in the sequence, and the more you add the closer you will get to 2. Moreover, you can get as close to 2 as you like by adding sufficiently many terms in the sequence. It’s in this sense that the sum of the infinite sequence is 2.
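You can watch this convergence happen numerically. Here’s a throwaway sketch in Python:

```python
# Partial sums of 1 + 1/2 + 1/4 + 1/8 + ... creep up towards 2
total = 0.0
for n in range(20):
    total += (1 / 2) ** n
    print(n + 1, "terms:", total)
# After 20 terms the sum is already 1.9999980926...: close to 2, never beyond it
```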

In Statistics the concept of infinity and infinite sums is equally important, as we’ll discuss in a future post. But meantime… the infinite monkey theorem. What this basically says is that if something can happen in an experiment, and you repeat that experiment often enough, then eventually it will happen.

Sort of. There’s still a possibility that it won’t – the monkey’s random key presses could, just by chance, turn out to be the letter ‘a’ forever, for example – but that possibility has zero probability. That’s the ‘almost surely’ bit in the Wikipedia definition. Put the other way round, the monkey will eventually produce the complete works of Shakespeare with probability 1 – which, as the ‘almost surely’ wording hints, is not quite the same thing as absolute certainty.

Let’s look at the calculations, which are very similar to those in another recent post.

There are roughly 50 keys on a keyboard, so assuming the monkey is just hitting keys at random, the probability that the first key stroke matches the first letter of Shakespeare’s works is 1/50. Similarly, the probability that the second letter matches is also 1/50. So the probability that the first two both match is

1/50 \times 1/50

Our monkey keeps hitting keys, and with each new key stroke the probability that the match-up continues is multiplied by a further 1/50. This probability gets small very, very quickly. But it never gets to zero.

Now, if the monkey has to hit N keys to have produced a text as long as the works of Shakespeare, by this argument he’ll get a perfect match with probability

p=(1/50)^N

This will be a phenomenally small number. Virtually zero. But, crucially, not zero. Because if our tireless monkey repeats that exercise a large number of times, let’s say M times, then the probability that he produces Shakespeare’s works at least once is

Q = 1-(1-p)^M

And since p is bigger than zero – albeit only slightly bigger than zero – Q gets bigger as M grows. And just as the sum of the numbers 1, 1/2, 1/4, … gets closer and closer to 2 as the number of terms increases, so Q can be made as close to 1 as we like by choosing M large enough.

Loosely speaking, when M is infinity, the probability is 1. And even more loosely: given an infinite amount of time our monkey is bound to produce the complete works of Shakespeare.
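The full works of Shakespeare make p far too small for any of this to be simulated directly, but the formula behaves the same way on a scaled-down version. Here’s a sketch with an invented target text just N = 5 characters long, showing Q climbing towards 1 as M grows:

```python
# Probability of at least one perfect match in M attempts: Q = 1 - (1 - p)**M
# Scaled-down example: a 50-key keyboard, but a target text only 5 letters long
N = 5
p = (1 / 50) ** N   # probability that a single attempt matches: (1/50)^N

for M in [10**6, 10**8, 10**10, 10**12]:
    Q = 1 - (1 - p) ** M
    print(f"M = {M:.0e}: Q = {Q:.6f}")
# Q can be pushed as close to 1 as we like by taking M large enough
```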


Obviously, both the monkey and the works of Shakespeare are just metaphors, and the idea has been expressed in many different forms in popular culture. Here’s Eminem’s take on it, for example:

The China syndrome

In a couple of earlier posts I’ve mentioned how statistical analyses have sometimes been used to demonstrate that results in published analyses are ‘too good to be true’. One of these cases concerned Mendel’s laws of genetic inheritance. Though the laws have subsequently been shown to be unquestionably true, Mendel’s results on pea experiments were insufficiently random to be credible. The evidence strongly suggests that Mendel tweaked his results to fit the laws he believed to be true. He just didn’t understand enough about statistics to realise that the very laws he wanted to establish also implied sizeable random variation around predicted results, and the values he reported were much too close to the predicted values to be plausible.

As discussed in a recent academic article, a similar issue has been discovered in respect of official Chinese figures for organ donation. China has recently come under increasing international pressure to discontinue its practice of using organs of dead prisoners for transplants. One issue was consent – did prisoners consent to the use of their organs before their death? But a more serious issue concerned possible corruption, and even the possibility that some prisoners were executed specifically to make their organs available.

Anyway, since 2010 China has made efforts to discontinue this practice, replacing it with a national system of voluntary organ donation. Moreover, they announced that from 2015 onwards only hospital-based voluntary organ donations would be used for transplants. And as evidence of the success of this program, two widely available datasets, published respectively by the China Organ Transplant Response System (COTRS) and the Red Cross Society of China, show rapid growth in the numbers of voluntary organ donations, which would more than compensate for the cessation of the practice of donations from prisoners.

Some of the yearly data counts from the COTRS database are shown in this figure, taken from the report referenced above. The actual data are shown by points (or triangles and crosses); the curves have been artificially added to show the general trend in the observed data. Clearly, for each of the count types, one can observe a rapid growth rate in the number of donations.

But… here’s the thing… look at how closely the smooth curves approximate the data values. The fit is almost perfect for each of the curves. And there’s a similar phenomenon for other data, including the Red Cross data. But when similar relationships are looked at for data from other countries, something different happens: the trend is generally upwards, as in this figure, but the data are much more variable around the trend curve.

In summary, it seems much more likely that the curves were chosen first, and the data then chosen to fit very closely to the curves. But just like Mendel’s pea data, this has been done without a proper awareness that nature is bound to lead to substantial variations around an underlying law. However, unlike Mendel, who presumably just invented numbers to take shortcuts to establish a law that was true, the suspicion remains that neither the data nor the law are valid in the case of the Chinese organ donation numbers.


A small technical point for those of you who might be interested in such things. The quadratic curves in the above plot were fitted in the report by the method of simple least squares, which aims to find the quadratic curve that minimises the overall distance between the points and the curve. As a point of principle, I’d argue this is not very sensible. When the counts are bigger, one would expect to get more variation, so we’d probably want to give the variation at large counts less weight in the fit, and the variation at low counts more. In other words, we’d expect the curve to fit better in the early years and worse in the later years, and we should take that into account when fitting the curve. In practice, the variations around the curves are so small that the results obtained by doing things this way are likely to be almost identical. So, it’s just a point of principle more than anything else. But still, in an academic paper which purports to use the best available statistics to discredit the claim made by a national government, it would probably be best to make sure you really are using the most appropriate statistical methods for the analysis.
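For anyone curious, this is roughly the distinction in code. The counts below are invented stand-ins (the report’s data aren’t reproduced here), and the 1/count weights correspond to the Poisson-style assumption that variation grows with the size of the count:

```python
import numpy as np

# Invented stand-in data: yearly counts showing rapid, roughly quadratic growth
years = np.arange(2010, 2019)
counts = np.array([34.0, 210, 620, 1250, 2150, 3300, 4700, 6400, 8300])

t = years - years.min()
X = np.vander(t, 3)  # design matrix with columns t^2, t, 1

# Ordinary least squares: every year is weighted equally
ols_coef, *_ = np.linalg.lstsq(X, counts, rcond=None)

# Weighted least squares with weights 1/count: years with bigger counts,
# where more variation is expected, get less influence on the fit
sw = np.sqrt(1.0 / counts)
wls_coef, *_ = np.linalg.lstsq(X * sw[:, None], counts * sw, rcond=None)

print("OLS coefficients (t^2, t, 1):", ols_coef)
print("WLS coefficients (t^2, t, 1):", wls_coef)
```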

Black Friday

Had you heard today is Black Friday? Or have you been living as a hermit in a cave, without phone or access to emails, for the last couple of weeks or so?

Like Cyber Monday, Green Monday and Giving Tuesday, Black Friday is a retail event imported from the United States, where it is timed to coincide with the Thanksgiving national holiday period. Sadly, here in the UK, we don’t get the holiday, but we do get the pleasure of a day – which often extends to at least a couple of weeks – indulging ourselves with the luxury of purchasing goods that we probably don’t need at prices that are well below the usual retail price.

Or do we?

The consumer group Which? monitored the prices of 83 products that were offered for sale during 2018’s Black Friday event and found:

  • 95% of the products were available at the same price or cheaper in the 6 months following Black Friday;
  • 61% of the products had been available at the same price or cheaper in the 6 months prior to Black Friday;
  • Just 5% of the products were genuinely at their cheapest on Black Friday compared to the surrounding 12-month period.

Obviously, 83 products is not a huge sample size, especially since different retailers are likely to have different pricing strategies, so you shouldn’t read too much into the exact numbers. But the message is clear, and probably ties in with your own experience of the way retailers manipulate shoppers’ expectations during ‘sales’.

Anyway, a fun statistical analysis of various aspects of Black Friday can be found here. I’m not sure how reliable any of the analyses are, especially in light of the Which? results, but an example is given in the following figure. This shows – apparently – the sales growth per country on Black Friday compared to a regular Friday.

Now, I don’t know if it’s the number of items sold, the money spent, or something else, but in any case Pakistan supposedly has a retail rate that’s 11525% of a normal Friday’s rate. That’s to say, a sales increase by a factor of about 115. In Italy the factor is 45, and even in the UK the usual Friday rate is multiplied by 15. Impressive, if true.

But I’m personally more impressed by Thailand, which doggedly spends less than half of a normal Friday’s expenditure on Black Friday. Of course, we can’t tell from these data whether this is due to a genuine resistance to Black Friday, or whether Thailand has such strong seasonal variation in sales that this time of year is naturally a period of low sales.


Finally, if you want to empathise with Thailand, you could yourself participate in Buy Nothing Day, intentionally held on the same day as Black Friday. It probably doesn’t need much in the way of explanation, but just in case, here’s the tagline from the webpage:

Lock up your wallets and purses, cut up your credit cards and dump the love of your life – shopping!

It’s time to celebrate Buy Nothing Day!

Maybe someone should pass the message on to Pakistan.

At The Intersection

You’ll remember Venn diagrams from school. They’re essentially a mathematical tool for laying out the information in partially overlapping sets. And in statistics they are often used in the same way, for showing the possible outcomes of events which might overlap.

For example, here’s a Venn diagram showing the relationship between whales and fish:

Whales and fish have some properties that are unique, but they also have some features in common. These are all shown in the appropriate parts of the diagram, with the common elements falling in the part of the sets that overlap – the so-called intersection.

With this in mind, I recently came across the following Venn poem titled ‘At the Intersection’ written by Brian Bilston:

You can probably work it out. There are three poems in total: separate ones for ‘him’ and ‘her’ and their intersection. Life seen from two different perspectives, the result of which is contained in the intersection.

Genius.

Juvenile dinosaurs

This blog is mostly about Statistics as a science rather than statistics as numbers. But just occasionally the statistics themselves are so shocking, they’re worthy of a mention.

With this in mind I was struck by two statistics of a similar theme in the following tweet from Ben Goldacre (author of the Bad Science website and book):

Moreover, in the discussion following Ben’s tweet, someone linked to the following cartoon figure:

This shows that even if you change the way of measuring distance from time to either phylogenetic distance or physical similarity, the same conclusion holds: the distance between a sparrow and a T-Rex is smaller than that between a T-Rex and a Stegosaurus.


Footnote 1: this is more than a joke. Recent research makes the case that there is a strong evolutionary link between birds and dinosaurs. As one of the authors writes:

We now understand the relationship between birds and dinosaurs that much better, and we can say that, when we look at birds, we are actually looking at juvenile dinosaurs.

Footnote 2. Continuing the series (also taken from the discussion of Ben’s tweet)… Cleopatra is closer in time to the construction of the space shuttle than to the construction of the pyramids.

Footnote 3. Ben Goldacre’s book, Bad Science, is a great read. It includes many examples of the way science – and Statistics – can be misused.

Problem solved

A while back I set a puzzle asking you to try to remove three coins from a red square region as shown in the following diagram.

The only rule of the game is that when a coin is removed it is replaced with two coins: one immediately to the right of, and one immediately below, the coin that is removed. If there is no space for adding these replacement coins, the coin cannot be removed.

The puzzle actually appeared in a recent edition of Alex Bellos’ Guardian mathematics puzzles, though it was created by the Argentinian mathematician Carlos Sarraute. This is his solution, which is breathtaking in its ingenuity.

The solution starts by giving a value to every square in the grid as follows:

Remember, the grid goes on forever both to the right and downwards. The top left-hand box has value 1. Going right from there, every subsequent square has value equal to 1/2 of the previous one. So: 1, 1/2, 1/4, 1/8 and so on. The first column is identical to the first row. To complete the second row, we start with the first value, 1/2, and again just keep multiplying by 1/2. The second column is the same as the second row. And we fill the entire grid this same way. Technically, every row and column is a geometric sequence: each term is the previous one multiplied by a common ratio, which in this case is 1/2.

Let’s define the value of a coin to be the value of the square it’s on. Then the total value of the coins at the start of the game is

1 + \frac{1}{2} + \frac{1}{2} = 2

Now…

  • When we remove a coin we replace it with two coins, one immediately below and one immediately to the right. But the value of any square on the grid is equal to the sum of the values of the squares immediately below it and immediately to its right. So when we remove a coin we replace it with two coins whose total value is the same. It follows that the total value of the coins stays unchanged, however many moves we make: it will always be 2.
  • This is the only tricky mathematical part. Look at the first row of numbers. It consists of 1, 1/2, 1/4, 1/8… and goes on forever. But even though this is an infinite sequence it has a finite sum of 2. Obviously, we can never really add infinitely many numbers in practice, but by adding more and more terms in the series we will get closer and closer to the value of 2. Try it on a calculator. In summary:

1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} +\ldots  = 2.

  • Working down the rows, the second row is the same as the first with the first term removed. So its sum must be 1. The third is the same as the second with the first term of 1/2 removed, so its sum is 1/2. By the same reasoning, the sum of the fourth row will be 1/4, the fifth row 1/8 and so on.
  • So, the row sums are respectively 2, 1, 1/2, 1/4, …. This is the same as the values of the first row with an additional first term of 2. It follows that the sum of the row sums, and therefore the sum of all numbers in the grid, is 2+2=4. Again, we can’t add all the numbers in practice, but we will get closer and closer to the value of 4 by adding more and more squares.
  • The total value of the squares inside the red square is 1 + 1/2 + 1/2 + 1/4 = 9/4. Since the whole grid has total value 4, the total value outside this region must be 4 - 9/4 = 7/4.
  • Putting all this together, the initial value of the coins was 2, and after any number of moves the total value of all coins will always remain 2. But the total value of all squares outside the red square is only 7/4. It must therefore be impossible to remove the three coins from the red square, because to do so would require the coins outside of this area to have a total value of 2, which is greater than the 7/4 available in the entire region outside it. (These numbers are easy to check numerically, as in the sketch below.)
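Here’s that numerical check – a quick sketch, indexing rows and columns from zero, so that the square in row r and column c has value (1/2)^(r+c):

```python
def value(r, c):
    """Value of the square in row r, column c (both counted from 0)."""
    return (1 / 2) ** (r + c)

# Each square equals the sum of the squares immediately below and to its right,
# so replacing a removed coin with those two coins leaves the total unchanged
r, c = 3, 5
assert value(r, c) == value(r + 1, c) + value(r, c + 1)

# The total value of a large (truncated) grid approaches 4 as the grid grows
n = 60
grid_total = sum(value(r, c) for r in range(n) for c in range(n))
print(f"grid total for n = {n}: {grid_total:.10f}")  # very close to 4

# Red square (top-left 2x2): 1 + 1/2 + 1/2 + 1/4 = 9/4; outside: 4 - 9/4 = 7/4
red = value(0, 0) + value(0, 1) + value(1, 0) + value(1, 1)
print("red square:", red, "  outside:", grid_total - red)
```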

I find this argument quite brilliant. My instincts were to try to solve the puzzle using arguments from geometry. I failed. It would never have occurred to me to try to develop a solution based on the properties of numbers.


As I wrote in the original post, this puzzle doesn’t really have any direct relevance to Statistics, except insofar as it shows the power and beauty of mathematical proof, which is an essential part of statistical theory. Having said that, the idea of infinite limits is important in Statistics, and I’ll discuss this in a further post. Let me just mention though that summing infinite series as in the solution above is a delicate issue for at least two reasons:

  1. Although the series 1 + 1/2 + 1/4 + 1/8 + … has a finite sum of 2, the series 1 + 1/2 + 1/3 + 1/4 + 1/5 + … has no finite sum. The sum grows very slowly, but as I take more and more numbers in the series, the sum grows without any limit. That’s to say, if you give me any number – say 1 million – I can always find enough terms in the series for the sum to be greater than that number. (There’s a numerical illustration of this after the list below.)
  2. To get the total value of the grid, we first added the rows and then added these row sums across the columns. We could alternatively have first added the columns, and then added these column sums across the rows, and we’d have got the same answer. For this example both these alternatives are valid. But in general this interchange of row and column sums to get the total sum is not valid. Consider, for example, this infinite grid:

The first row sums to 2, after which all other rows sum to zero. So, the sum of the row sums is 2. But looking at the columns, every column sums to zero. So, if we sum the columns and then sum these sums, we get 0. This couldn’t possibly happen with finite grids, but infinite grids require a lot more care.
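Going back to the first of these points, here’s the promised sketch: the harmonic series crawls upwards without ever settling down, in contrast to the geometric series above.

```python
# Partial sums of the harmonic series 1 + 1/2 + 1/3 + ... grow without limit
total = 0.0
n = 0
for target in [2, 5, 10, 15]:
    while total < target:
        n += 1
        total += 1 / n
    print(f"sum first exceeds {target} after {n:,} terms")
# The growth is painfully slow (passing 15 takes about 1.8 million terms),
# but given any target, enough terms will eventually carry the sum past it
```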

In a follow-up post we’ll consider limits of sums in the context of Statistics.


Finally, I’m grateful to Fabian.Thut@smartodds.co.uk for some follow-up discussion on the original post. In particular, we ended up discussing the following variation on the original puzzle. The rules are exactly the same as before, but the starting configuration of the coins is now as per the following diagram:

In this case, can the puzzle be solved? Does the argument presented for the original problem help in any way?

If you have any thoughts about this, please do write to me. In any case, I’ll write another post with the solution to this version shortly.