Getting high

Speed climbing is what it says on the tin: climbing at speed. The objective is to climb a standard wall with a height of 15 metres as quickly as possible. Speed climbing is actually one of three disciplines – the others being ‘bouldering’ and ‘lead’ – that together comprise Sport Climbing. This combined category will be included as an Olympic sport for the first time in Tokyo, 2020.

The history of Sport Climbing is relatively brief. It seems to have developed from Sportroccia, which was the first international competition for climbers held in different locations in Italy from 1985 to 1989. This led to the first World Championships in Frankfurt in 1991, since which there has been a Sport Climbing World Championship event held every two years.

The inclusion of speed climbing as one of the disciplines in Sport Climbing has always been controversial. Many climbers regard the techniques required to climb at speed as being at odds with the skills that are needed for genuine outdoor climbs, like the one in the picture at the header of this post.

The controversy is such that even though Sport Climbing will be in the Olympics for the first time in 2020, a new format is being proposed for the 2024 Olympics in which Speed Climbing is separated as a discipline from the other two categories.

Anyway, leaving the controversy aside, climbing 15 metres doesn’t sound too daunting until you look at a picture of what it entails…

For experienced climbers a wall like this isn’t particularly challenging, but speed climbers have the additional task of competing against both an opponent – who is simultaneously completing an identical course – and the clock. The current world records are 5.48 seconds for men and 6.995 seconds for women. Just to put that in perspective: the men’s record corresponds to a speed of almost 10 km per hour. Vertically. With not much to hold onto.
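If you want to check that speed for yourself, here is a minimal sketch of the arithmetic in Python. The function name is just my own illustrative choice; the 15-metre wall height and the two record times are the ones quoted above.

```python
def average_speed_kmh(height_m: float, time_s: float) -> float:
    """Average vertical speed in km/h for a climb of height_m metres in time_s seconds."""
    metres_per_second = height_m / time_s
    return metres_per_second * 3.6  # 1 m/s is 3.6 km/h

WALL_HEIGHT_M = 15.0  # standard speed-climbing wall height quoted in the post

mens_speed = average_speed_kmh(WALL_HEIGHT_M, 5.48)
womens_speed = average_speed_kmh(WALL_HEIGHT_M, 6.995)

print(f"Men's record:   {mens_speed:.2f} km/h")
print(f"Women's record: {womens_speed:.2f} km/h")
```

Bear in mind that this is an average over the whole climb, including the start from rest, so the peak speed on the wall will be higher.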

The women’s world record was actually set very recently by the Indonesian climber Aries Susanti Rahayu – nicknamed Spiderwoman. You can see her record-breaking climb here.

The men’s world record was set by the Iranian climber Reza Alipourshenazandifar in 2017. (Performance here.)

Like my recent discussion about marathon times, what’s interesting about speed climbing from a statistical point of view is trying to assess what the fastest possible climb time might be.

The following graphs show how the records have fallen over time for both men and women.

Though irregular, you could convince yourself that the pattern for women’s records is approximately following a straight line. On the other hand, notwithstanding the lack of data, the pattern for men seems more like a curve that could be levelling off. These two observations aren’t mutually consistent though, as they would suggest that not too far into the future the women’s record will be faster than the men’s, which is implausible – though not impossible – for biological reasons.

This illustrates a number of difficulties with statistical modelling in this type of context:

  1. We have very few data to work with;
  2. To predict forwards we need to assume some basic pattern for the data, but the choice of pattern – say linear or curved – is likely itself to affect how results extrapolate into the future;
  3. Separate extrapolations for women and men might lead to incompatible results;
  4. As also discussed in the context of predicting ultimate marathon times, an extrapolation based just on numbers ignores the underlying physics and biology which ultimately determine what the limits of human capacity are.

Maybe have a look at the data yourselves and write to me if you have ideas about what the ultimate times for both men and women might be. I’ll post any suggestions and perhaps even add ideas of my own in a future post.

Relatively speaking

Last week, when discussing Kipchoge’s recent sub-2-hour marathon run, I showed the following figure which compares histograms of marathon race times in a large database of male and female runners.

I mentioned then that I’d update the post to discuss the unusual shape of the histograms. The point I intended to make concerns the irregularity of the graphs. In particular, there are many spikes, especially before the 3, 3.5 and 4 hour marks. Moreover, there is a very large drop in the histograms – most noticeably for men – after the 4 hour mark.

This type of behaviour is unusual in random processes: frequency diagrams of this type, especially those based on human characteristics, are generally much smoother. Naturally, with any sample data, some degree of irregularity in frequency data is inevitable, but:

  1. These graphs are based on a very large sample of more than 3 million runners, so random variations are likely to be very small;
  2. Though irregular in shape, the timings of the irregularities are themselves regular.

So, what’s going on?

The irregularities are actually a consequence of the psychology of marathon runners attempting to achieve personal targets. For example, many ‘average’ runners will set a race time target of 4 hours. Then, either through a programmed training regime or sheer force of will on the day of the race, they will push themselves to achieve this race time. Most likely not by much, but enough to be on the left side of the 4-hour mark.

The net effect of many runners behaving similarly is to cause a surge of race times just before the 4-hour mark and a dip thereafter. There’s a similar effect at 3 and 3.5 hours – albeit of a slightly smaller magnitude – and smaller effects still at what seem to be around 10 minute intervals. So, the spikes in the histograms are due to runners consciously adapting their running pace to meet self-set objectives which are typically at regular times like 3, 3.5, 4 hours and so on.
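To see how this mechanism can produce a spike and a dip, here is a rough simulation sketch in Python. Everything numerical in it is an assumption of mine rather than something estimated from the database: the bell-shaped spread of ‘natural’ finish times, the 10-minute window, and the 50% chance that a runner on course to just miss 4 hours manages to sneak under it.

```python
import random

def simulate_finish_times(n_runners, target_min=240.0, window_min=10.0,
                          push_probability=0.5, seed=1):
    """Simulate finish times (minutes) where some runners push to beat a target."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_runners):
        # Assumed smooth 'natural' finish time: mean 260 min, sd 45 min.
        t = rng.gauss(260.0, 45.0)
        # A runner heading for just over the target may push to finish under it.
        if target_min <= t < target_min + window_min and rng.random() < push_probability:
            # Move them to somewhere in the 3 minutes strictly before the target.
            t = target_min - 3.0 * (1.0 - rng.random())
        times.append(t)
    return times

def count_in_range(times, low, high):
    """Number of finish times t with low <= t < high."""
    return sum(1 for t in times if low <= t < high)

times = simulate_finish_times(100_000)
just_before = count_in_range(times, 235.0, 240.0)
just_after = count_in_range(times, 240.0, 245.0)

print("Finishers in the 5 minutes before 4 hours:", just_before)
print("Finishers in the 5 minutes after 4 hours: ", just_after)
```

Without the ‘push’ step the two counts would be roughly equal, since the underlying distribution is smooth around 4 hours. With it, the bin just before the target fills up at the expense of the bin just after, which is the pattern seen in the histograms.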

Thanks to those of you that wrote to me to explain this effect.

Actually though, since writing the original post, something else occurred to me about this figure, which is why I decided to write this separate post instead of just updating the original one. Take a look at the right hand side of the plot – perhaps from a finish time of around 5 hours onwards. The values of the histograms are pretty much the same for men and women in this region. This contrasts sharply with the left side of the diagram where there are many more men than women finishing the race in, say, less than 3 hours. So, does this mean that although at faster race times there are many more men than women, at slow race times there are just as many women as men?

Well, yes and no. In absolute terms, yes: there are pretty much the same number of men as women completing the race with a time of around 6 hours. But… this ignores the fact that there are actually many more men than women overall – one of the other graphics on the page from which I copied the histograms states that the male:female split in the database is 61.8% to 31.2%. So, although the absolute number of men’s race times is similar to that of women’s, the proportion of male runners that this represents is considerably lower than the corresponding proportion for women.

Arguably, comparing histograms gives a misleading representation of the data. It makes it look as though men and women are equally likely to have a race time of around 6 hours. The counts are indeed similar, but only because many more men than women run the marathon. The proportion of men completing the race with a time of around 6 hours is considerably smaller than that of women.

The same principle holds at all race times but is less of an issue when interpreting the graph. For example, the difference in proportions of men and women having a race time of around 4 hours is smaller than that of the actual frequencies in the histograms above, but it is still a big difference. It’s really where the absolute frequencies are similar that the picture above can be misleading.

In summary: there is a choice when drawing histograms of using absolute or relative frequencies. (Or counts and percentages). When looking at a single histogram it makes little difference – the shape of the histogram will be identical in both cases. When comparing two or more sets of results, histograms based on relative frequencies are generally easier to interpret. But in any case, when interpreting any statistical diagram, always look at the fine detail provided in the descriptions on the axes so as to be sure what you’re looking at.
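As a small illustration of the difference, here is a sketch in Python that converts counts to relative frequencies for two groups of different sizes. The counts are entirely made up by me to mimic the situation described above (equal counts in the slowest bins, but more men overall); they are not values from the marathon database.

```python
def relative_frequencies(counts):
    """Convert a list of bin counts into proportions of the group's total."""
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical finisher counts per finish-time bin (NOT the real database values).
bins = ["<3h", "3-4h", "4-5h", "5-6h", "6h+"]
men_counts = [300, 2500, 2200, 800, 200]    # 6000 men in total
women_counts = [50, 1000, 1700, 800, 200]   # 3750 women in total

men_rel = relative_frequencies(men_counts)
women_rel = relative_frequencies(women_counts)

for label, mc, wc, mr, wr in zip(bins, men_counts, women_counts, men_rel, women_rel):
    print(f"{label:>5}: men {mc:5d} ({mr:6.1%})   women {wc:5d} ({wr:6.1%})")
```

In the two slowest bins the counts are identical for men and women, yet the percentages are clearly higher for women, because the same count is a bigger share of a smaller group.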


Footnote:

Some general discussion and advice on drawing histograms can be found here.

No human is limited

Do you run a bit? If so, chances are you can run 100 metres in 17 seconds. Which puts you in the same class as the Kenyan marathon runner Eliud Kipchoge.

Just one small catch: you have to keep that pace going for 2 hours.
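For anyone who wants to check the 17-second figure, here is a minimal sketch of the calculation in Python. The only number not taken from the post is the official marathon distance of 42.195 km (the 26.2 miles mentioned below); the function name is my own.

```python
MARATHON_M = 42_195  # official marathon distance in metres (about 26.2 miles)

def seconds_per_100m(total_seconds: float, distance_m: float = MARATHON_M) -> float:
    """Average time in seconds to cover each 100 m at a constant pace."""
    return total_seconds / distance_m * 100.0

two_hours = 2 * 60 * 60  # 7200 seconds
pace = seconds_per_100m(two_hours)

print(f"A 2-hour marathon needs {pace:.2f} seconds per 100 m")
```

So a marathon in exactly 2 hours means averaging just over 17 seconds for every one of the roughly 422 consecutive 100-metre stretches.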

In an earlier post I discussed how Kipchoge had made an attempt at a sub-2-hour marathon in Monza, Italy, but failed. Just. Well, as you probably know, this weekend he successfully repeated the attempt in Vienna, beating the 2-hour milestone by almost 20 seconds.

The theme of that earlier post was whether Statistics could be used to predict ultimate performance times: what is the fastest time possible for any human to run 26.2 miles? There must be some limit, but can we use data to predict what it will be? I included this graph in the previous post to make the point:

This graphic is actually unchanged despite Kipchoge’s Vienna run because, as in Italy, the standard IAAF conditions were not met. In particular:

  1. Kipchoge was supported by a rotating team of 41 pace runners who, as well as setting the pace, formed an effective windshield;
  2. A pace car equipped with a laser beam was used to point to the ideal running point for Kipchoge on the road.

So, we can’t add Kipchoge’s 1:59:40 to the graphic. But his race time demonstrates that 2 hours is not a physical barrier, and one might guess that it’s just a matter of time before a sub-2-hour marathon is achieved under official IAAF conditions. Probably by Kipchoge.

Other things were also designed to maximise Kipchoge’s performance:

  1. The race circuit was completely flat;
  2. Kipchoge was wearing specially designed shoes (provided by Nike) that are estimated to improve his running economy by 7-8%;
  3. His drinks were provided by a support team on bicycles to avoid him having to slow down to collect refreshments;
  4. The event was sponsored by Ineos, a multibillion dollar chemical company (with a dodgy environmental record).

Nonetheless: what an astonishing achievement!

Undoubtedly there is a limit to what’s humanly possible for a marathon race time, but records will almost certainly continue to be broken as the limit is approached in smaller and smaller increments. However, as discussed in the original post, Statistics is unlikely to provide accurate answers to what that limit will be. An analysis of the available data in 1980 would most likely have suggested an ultimate limit somewhere above 2 hours. But seeing the more recent data, and knowing what happened at the weekend, it seems likely that this threshold will eventually be broken in an official race.

This is a bit misleading though. What we’ve discussed so far is extrapolating the data in the graph above without taking their context into account. Yet the data do have a context, and this suggests that, above and beyond improvements in training regimes and running equipment, the ultimate limit will be determined by the boundaries of human physiology. And this implies that biological and physical rules will apply. Indeed, research published in 1985 suggested an absolute limit for the marathon of 1:57:58. This research comprised a statistical analysis, but in combination with models of the rate at which humans consume oxygen for energy conversion. Who knows if this prediction will stand the test of time or not, but the fact that it is based on an analysis which combines Statistics with the relevant Science suggests that it is more reliable than an extrapolation based solely on abstract numbers.


Footnote 1:

An article in the Observer on Sunday described Kipchoge’s Vienna run in a similar context, discussing the limits that there might be on human sporting achievements. It also listed a number of long-standing sporting records, including Paula Radcliffe’s record women’s marathon time of 2:15:25, which was set in 2003. By Sunday afternoon that record had been smashed by a margin of 81 seconds by the Kenyan runner Brigid Kosgei.


Footnote 2:

For most people running marathons, the 2-hour threshold is, let’s say, not especially relevant. Some general statistics on marathon performance from a database of more than 3 million runners are available here.

It includes the following histogram of race times, which I found interesting. Actually it’s 2 histograms, one in blue (for women) superimposed on that in red (for men).

Both histograms have unusual shapes which seem to tell us something about marathon runners. Can you explain what?

I’ll update this post with my own thoughts in a week or so.

The 10-minute marathon challenge

Not content with having recently won the London marathon for the fourth time in a record time of 2:02:37, the phenomenal Kenyan athlete Eliud Kipchoge has announced a new bid to run the marathon distance in under two hours. The time Kipchoge set in the London marathon was already the second fastest in history and Kipchoge also holds the record for the fastest ever marathon, at 2:01:39, set in Berlin in 2018. But the sub-2-hour marathon remains an elusive goal.

In 2017 Nike sponsored an attempt to break the 2-hour target. Three elite runners, including Kipchoge, trained privately to run a marathon-length distance in circuits around the Monza racetrack in Italy. Kipchoge won the race, but in a time of 2:00:25, therefore failing by 25 seconds to hit the 2-hour target. The specialised conditions for this attempt, including the use of relay teams of pace setters, meant that the race fell outside of IAAF standards, and therefore the 2:00:25 is not valid as a world record. Kipchoge’s planned attempt in London will also be made under non-standard conditions, so whatever time he achieves will also not be considered as valid in respect of IAAF rules. Regardless of this, beating the 2-hour barrier would represent a remarkable feat of human achievement, and this is Kipchoge’s goal.

But this raises the question: is a sub-2-hour marathon under approved IAAF standards plausible? The following graphic shows how the marathon record has improved in the last 100 years or so, from Johnny Hayes’ record of 2:55:18 in 1908, right up to Kipchoge’s Berlin record.

Clearly there’s a law of diminishing returns in operation: the very substantial improvements in the first half of the graph are replaced by much smaller incremental improvements in the second half. This is perfectly natural: simple changes in training regimes and running equipment initially enabled substantial advances; more recent changes are much more subtle, and result only in marginal improvements. So, the shape of the graph is no surprise. But if you were extrapolating forward to what might happen in the next 10, 100 or 1000 years, would your curve go below the 2-hour threshold or not?

Actually, it’s straightforward to take a set of data, like those contained in the graphic above, and fit a nice smooth curve that does a reasonable job at describing the overall pattern of the graph. And we could then extrapolate that curve into the future and see whether it goes below 2 hours or not. And if it does, we will even have a prediction of when it does.

But there’s a difficulty – the question of whether the solution crosses the 2-hour threshold or not is likely to depend very much on the type of curve we use to do the smoothing. For example, we might decide that the above graphic is best broken down into sections where the pattern has stayed fairly similar. In particular, the most recent section from around 1998 to 2018 looks reasonably linear, so we might extrapolate forward on that basis, in which case the 2-hour threshold is bound to be broken, and pretty soon too. On the other hand we might decide that the whole period of data is best described by a kind of ‘ell’-shaped curve which decreases to a lower horizontal limit. And then the question will be whether that limit is above or below 2 hours. In both cases the data will determine the details of the curve – the gradient of the straight line, for example, or the limit of the ‘ell’-shaped curve – but the choice of the form of the graph – linear, ‘ell’-shaped or something else – is likely to be made on more subjective grounds. And yet that choice will possibly determine whether the 2-hour threshold is predicted to be beaten or not.
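To make the two options concrete, here is a rough sketch in Python that fits both forms. Please treat the data as an assumption: the (year, minutes) pairs are rounded, illustrative values that I chose to resemble the shape of the graphic, and only the 1908 and 2018 values are the ones quoted in the text. The particular ‘ell’-shaped curve (an exponential decay towards a limit), the grid of decay rates and all of the names are also my own choices, not a standard recipe.

```python
import math

# Illustrative (year, record in minutes) pairs: rounded values chosen only to
# resemble the graphic. Only the 1908 and 2018 values are quoted in the post.
records = [(1908, 175.3), (1925, 149.0), (1947, 145.7), (1952, 140.7),
           (1960, 135.3), (1969, 128.6), (1988, 126.8), (1998, 126.1),
           (2007, 124.4), (2014, 123.0), (2018, 121.7)]

def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Form 1: a straight line through the most recent stretch only (1998 onwards).
recent = [(yr, t) for yr, t in records if yr >= 1998]
intercept, slope = least_squares_line([yr for yr, _ in recent], [t for _, t in recent])
crossing_year = (120.0 - intercept) / slope  # year at which the line reaches 2 hours

# Form 2: an 'ell'-shaped curve, time = limit + b * exp(-c * years_since_1908).
# For each trial decay rate c the model is linear in (limit, b), so that part is
# fitted by least squares, and the c with the smallest squared error is kept.
def fit_ell_curve(data, rates):
    best = None
    for c in rates:
        xs = [math.exp(-c * (yr - 1908)) for yr, _ in data]
        ys = [t for _, t in data]
        limit, b = least_squares_line(xs, ys)
        sse = sum((limit + b * x - y) ** 2 for x, y in zip(xs, ys))
        if best is None or sse < best[0]:
            best = (sse, limit, b, c)
    return best

trial_rates = [k / 1000 for k in range(5, 101)]  # c from 0.005 to 0.100 per year
sse, limit_min, b, c = fit_ell_curve(records, trial_rates)

print(f"Straight line: {slope:.3f} min/year, reaches 2:00:00 around {crossing_year:.0f}")
print(f"'Ell' curve:   fitted lower limit of {limit_min:.1f} minutes (c = {c:.3f})")
```

A falling straight line must cross 2 hours at some point, so the only question it can answer is when. The ‘ell’-shaped curve instead gives a limiting time, which may land on either side of 120 minutes depending on the data and on the exact curve family chosen. That is the subjectivity I’m describing.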

There’s no way round this difficulty, though statistical techniques have been used to try to tackle it more rigorously. As I mentioned in a previous post, since athletics times are fastest times – whether it’s the fastest time in a race, or the fastest time ever when setting a record – it’s natural to base analyses on so-called extreme value models, which are mathematically suited to this type of process. But this still doesn’t resolve the problem of how to choose the curve which best represents the trend seen in the above picture. And the results aren’t terribly reliable. For example, in an academic paper, ‘Records in Athletics Through Extreme-Value Theory’, written in 2008, the authors John Einmahl and Jan Magnus predicted the absolute threshold times or distances (in the case of field events) for a number of athletic events. At the time of writing their paper the world record for the marathon was 2:04:26, and they predicted a best possible time of 2:04:06.

History, of course, proved this to be completely wrong. To be fair to the authors though, they gave a standard error on their estimate of 57 minutes. Without going into detail, the standard error is a measure of how accurate the authors think their best answer is likely to be, and one rule of thumb interpretation of the standard error is that if you give a best answer and a standard error, then you’re almost certain the true value lies within 2 standard errors of your best answer. So, in this case, the authors were giving a best estimate of 2:04:06, but – rather unhelpfully – saying the answer could be as much as 114 minutes faster than that, taking us down to a marathon race time of 0:10:06.
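The rule-of-thumb arithmetic can be sketched in a few lines of Python. The helper names are my own, and the ‘2 standard errors either side’ range is just the rough rule described above, not the authors’ own interval.

```python
def to_seconds(hours: int, minutes: int, seconds: int) -> int:
    """Convert a time given as hours, minutes, seconds into total seconds."""
    return hours * 3600 + minutes * 60 + seconds

def format_hms(total_seconds: int) -> str:
    """Format a non-negative number of seconds as h:mm:ss."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"

estimate = to_seconds(2, 4, 6)   # best estimate of 2:04:06
standard_error = 57 * 60         # 57 minutes, in seconds

# Rule of thumb: the true value is almost certainly within 2 standard errors.
lower = estimate - 2 * standard_error
upper = estimate + 2 * standard_error

print("Fastest plausible limit:", format_hms(lower))
print("Slowest plausible limit:", format_hms(upper))
```

The lower end comes out at 0:10:06, as above. The upper end is 3:58:06, which is slower than the record Johnny Hayes set in 1908, so the range is uninformative in both directions.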

So, come on Kipchoge, never mind the 2-hour marathon, let’s see if you’re up to the 10-minute marathon challenge.


Footnote: don’t trust everything you read in statistical publications. (Except in this blog, of course 😉).