Evaluation – March Update – “The Model vs Lawro vs Merse” – Model holds on to narrow Lead

The Model is holding on to a slim lead over Lawro and Merson, but the “experts” are closing in. Can the computer hang on to the end of the season?

The table below shows the recent forecasting performance of “The Model vs Lawro vs Merse”, as well as the cumulative performance all season, as of 10am 16/3/2019, according to the BBC Sport scoring metric (40 points for a perfect scoreline forecast, 10 points for a correct result only).

Note: Lawro achieved 40 points more in round 27 than recorded on the BBC website as we include a midweek forecast he made for a game rearranged for the Carabao Cup final.

Round            Scorelines   Results    “Lawro” pts
                 (Model)      (Model)    Model   Lawro   Merson
Total            32/299       164/299    2600    2550    2540
30: 9-10 Mar     0/10         6/10       60      30      70
29: 2-3 Mar      1/10         7/10       100     100     100
28: 26-27 Feb    0/10         8/10       80      60      100
27: 6-24 Feb     2/9          5/9        110     150     90
26: 9-11 Feb     2/10         6/10       120     140     100
Rounds 1-25      27/250       132/250    2130    2070    2080
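The BBC scoring rule behind these points columns can be sketched as a small function (a minimal sketch; the function name and the example scorelines are ours, not the BBC's):

```python
def lawro_points(pred, actual):
    """Score one forecast under the BBC Sport rule: 40 points for the
    exact scoreline, 10 for the correct result only, 0 otherwise."""
    if pred == actual:
        return 40
    # Reduce a scoreline to its result: 1 home win, 0 draw, -1 away win
    result = lambda s: (s[0] > s[1]) - (s[0] < s[1])
    return 10 if result(pred) == result(actual) else 0

# A 2-1 forecast scores 40 against a 2-1 result, 10 against 3-1
# (right result, wrong score), and 0 against a 1-1 draw.
assert lawro_points((2, 1), (2, 1)) == 40
assert lawro_points((2, 1), (3, 1)) == 10
assert lawro_points((2, 1), (1, 1)) == 0
```

As a sanity check, the Model's season total is consistent with this rule: 32 exact scores and a further 132 results give 32 × 40 + 132 × 10 = 2600 points.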

Half-term Reports

We can always consult the league tables to find out where teams are – but are they where we expected them to be? It’s hard to avoid the idea that some teams are doing better than expected (Liverpool? Leeds?), and others are performing much worse than expected (Burnley, Millwall, Notts County). Shouldn’t tables reflect that?

Elo rankings offer such an opportunity. They give each competitor (team or individual) a strength score, from which a prediction between 0 and 1 can be generated for each match. In football, 0 would be an away win, 1 would be a home win, and 0.5 would represent a perfectly evenly matched event. Our outcome variable is therefore 0 for an away win, 0.5 for a draw, and 1 for a home win.

So we can look at every match so far this season, and see whether or not teams have exceeded expectations. If a home team had an Elo prediction of 0.45 and won, then the outcome is 1, and the Elo improvement is 1-0.45, hence positive. If on the other hand the Elo prediction was 0.7 and the visiting team won, then the home team’s adjustment would be 0-0.7, hence negative.

So teams that have consistently exceeded expectations will have a sum of adjustments that is positive, while a team that has underperformed will have a sum of adjustments that is negative.
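The two worked examples above can be put into code (a minimal sketch: the expectation formula shown is the standard logistic Elo one, which the post does not spell out, and the numbers simply restate the examples):

```python
def elo_expectation(home_rating, away_rating):
    """Standard logistic Elo expected score for the home team:
    a value between 0 (certain away win) and 1 (certain home win)."""
    return 1.0 / (1.0 + 10 ** ((away_rating - home_rating) / 400))

def outcome(home_goals, away_goals):
    """1 for a home win, 0.5 for a draw, 0 for an away win."""
    if home_goals > away_goals:
        return 1.0
    return 0.0 if home_goals < away_goals else 0.5

# The post's two examples, from the home team's point of view:
adjustments = [
    1.0 - 0.45,  # expected 0.45, won: +0.55, exceeded expectations
    0.0 - 0.70,  # expected 0.70, lost: -0.70, fell short
]
season_score = sum(adjustments)  # -0.15: slightly below expectations overall
```

Summing these per-match adjustments over the season gives each team the "exceeding expectations" score plotted in the figures below.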

In the featured image for this post, we plot the story so far for the Premier League. Watford’s strong start is clear, and Liverpool’s seemingly relentless journey to the summit of the table on Christmas Day is there. However, up until their loss at Southampton two games ago, Arsenal were actually the most improved team.

Champions Manchester City sit ninth in this table, having fallen from third on the eve of their match at Chelsea.

At the bottom, things are not surprising. Fulham, despite their summer investment, have struggled badly, as have Burnley following their success last season. Southampton and Crystal Palace are the two fastest improving sides at the moment.

Into the Championship, and indeed Leeds are the team most exceeding expectations, following on from their strong start, somewhat muted October, and strong November and December. Millwall are the biggest disappointment at the halfway point.


Into Leagues One and Two (below), Macclesfield and Notts County stand out in League Two for seriously, seriously bad performances relative to expectations.

Evaluation – Quick Update – “The Model vs Lawro vs Merse” – Model takes the Lead

The featured image (and below) plots the cumulative forecasting performance of “The Model vs Lawro vs Merse” since round 10 in the Premier League, and as of the midweek fixtures just gone, according to the BBC Sport scoring metric (40 points for a perfect scoreline forecast, 10 points for a correct result only).

The Model got off to a good start, but by round 5 tipster Merson had overtaken the Model, and by round 9 tipster Lawro was getting close (see here).

But now the Model has taken the lead again, overhauling Merson and pulling away from Lawro – and we wouldn’t bet against that trend continuing till the end of the season!



Evaluation – A Quick Update – “The Model vs Lawro vs Merson”

The featured image plots the cumulative forecasting performance of “The Model vs Lawro vs Merson”, according to the BBC Sport scoring metric (40 points for a perfect scoreline forecast, 10 points for a correct result only).

The Model got off to a good start, but hasn’t really picked up the pace as more information about teams’ relative abilities this season has become available, which in theory should have improved its performance.

In the meantime, Merson has overtaken the Model, and Lawro is getting close.

There is a discussion to be had this weekend in the pub, however, over whether matches at the start of this season were more predictable than normal, and whether “predictability” in the EPL has since fallen, perhaps offsetting any gains the Model has made in its abilities (e.g. the Model is gradually improving its opinion on Wolves, but struggling to distinguish Man U’s drop in form from their long-run average abilities).

The Model is beating the experts

After 40 Premier League matches, the Model is beating the experts. By experts we mean former professional footballers Mark Lawrenson (aka “Lawro”) and Paul Merson (aka “Merse”), who make well-publicised Premier League score forecasts for BBC Sport and Sky Sports, respectively.

Exact Scores:

The harshest performance metric for a football forecaster is the percentage of exact scorelines they get correct. The Model is currently performing at 15%. This just edges Merse, who has predicted 13% of scores bang on, while Lawro is trailing way behind, only getting 5% so far.

Number of Exact Scores Predictions Correct in Premier League 2018/19 Rounds 1-4:

The Model: 6/40

Lawro: 2/40

Merse: 5/40


A more forgiving performance metric is the percentage of results forecast correctly. Again the Model is leading the pack, getting 65% correct. Lawro is getting less than one in every two results correct, 48%. Merse is doing better, 58%.

Number of Result Predictions Correct in Premier League 2018/19 Rounds 1-4:

The Model: 26/40

Lawro: 19/40

Merse: 23/40

“Lawro Points”:

Finally, The Model is clearly outperforming Lawro at his own game, and Merse too for that matter. Using the points scoring system from the BBC Sport predictions game (40 for an exact score, and 10 for just a result), the Model has taken 28% of the points so far on offer. This compares with 16% for Lawro and 24% for Merse.

Accumulated “Lawro points” in Premier League 2018/19 Rounds 1-4:

The Model: 440

Lawro: 250

Merse: 380
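As a quick check of those percentages: with 40 matches at a maximum of 40 points each, there were 1,600 points on offer. A small arithmetic sketch using the totals above:

```python
available = 40 * 40  # 40 matches, 40 points maximum per match
points = {"The Model": 440, "Lawro": 250, "Merse": 380}
shares = {name: pts / available for name, pts in points.items()}
# The Model takes 27.5% of the points on offer (reported as 28%),
# Lawro 15.6% (~16%), and Merse 23.75% (~24%).
```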

Should the Model be more humble?

Probably. 40 games is still a relatively small sample, and there is plenty of time for the experts to turn things around. Lawro remains the biggest threat, given his historical forecast performance outstrips Merse by some distance.


As discussed on this blog previously, we would expect the model to only get better and better as the season progresses.



So close, but yet so far

With fifteen minutes of play left in most matches on Tuesday, our scoreline predictions were spot on in seven matches, and just one goal in a number of other matches could have yielded more exact scores. It looked like it might just be a bonus week.

Then Leeds equalised at Swansea and, one by one, all seven disappeared, until Crawley equalised with the last kick of the evening against nine-man Swindon and none of the possibilities had materialised.

At the same time, while we bemoan how close we have been, we’ve also been spectacularly out. We had Stoke to start strongly and win at Leeds. We picked QPR for a surprise win at West Brom, 1-0. The actual score was 7-1 to West Brom. Last night we thought Scunthorpe would beat Fleetwood 1-0 at home, but by the 29th minute they were 4-0 down, and eventually succumbed 5-0 at home.

Humblings all around. We got no exact scores, which was below expectation, and we got 14 results out of 34 matches, about 41%. That’s about the same frequency as the number of home wins, meaning had we just predicted a home win in every match, we’d have done about as well.

This raises the question of how we evaluate our forecasts. Do we just record a zero for a 1-7 when we picked 1-0, and a one when the 2-1 we predicted does actually happen? Or do we sum up how many goals out we were, so that we were 7 out for West Brom (|0 − 7| + |1 − 1| = 7), and 6 out for Scunthorpe (|1 − 0| + |0 − 5| = 6)?
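The goal-distance idea can be sketched as follows (one possible metric among the options just discussed; the function name is ours):

```python
def goal_distance(pred, actual):
    """Total goals 'out': the sum of absolute errors on each team's score."""
    return abs(pred[0] - actual[0]) + abs(pred[1] - actual[1])

# West Brom 7-1 QPR when we picked a 1-0 QPR win: seven goals out.
assert goal_distance((0, 1), (7, 1)) == 7
# Scunthorpe 0-5 Fleetwood when we picked a 1-0 home win: six goals out.
assert goal_distance((1, 0), (0, 5)) == 6
# An exact forecast is zero goals out.
assert goal_distance((2, 1), (2, 1)) == 0
```

Unlike the all-or-nothing exact-score record, this metric at least distinguishes a near miss from a spectacular one.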

We plan to develop a little the ways we evaluate forecasts, not least to reflect the way we are evaluating our own forecasts to try and make them better.

Reality Check

Last weekend the Premier League forecast performance was exceptional. This weekend, it looks no better than what people who thought Gay Meadow was just the name of a funfair could achieve.

But with 40 matches played so far in the Football League, the model is still on a par with last weekend in terms of correct scorelines predicted (4/40; 10%), and could improve with 6 games left to play. Overall, this hit rate is a little lower than what we would expect the model to achieve, but we are still tinkering with our exact method.