Half-term Reports

We can always consult the league tables to find out where teams are – but are they where we expected them to be? It’s hard to avoid the idea that some teams are doing better than expected (Liverpool? Leeds?), and others are performing much worse than expected (Burnley, Millwall, Notts County). Shouldn’t tables reflect that?

Elo rankings offer such an opportunity. They give each competitor (team or individual) a strength score, from which a prediction between 0 and 1 can be generated for each match. In football, 0 would be an away win, 1 would be a home win, and 0.5 would represent a perfectly evenly matched event. Our outcome variable is correspondingly 0 for an away win, 0.5 for a draw, and 1 for a home win.

So we can look at every match so far this season, and see whether or not teams have exceeded expectations. If a home team had an Elo prediction of 0.45 and won, then the outcome is 1, and the Elo improvement is 1-0.45, hence positive. If on the other hand the Elo prediction was 0.7 and the visiting team won, then the home team’s adjustment would be 0-0.7, hence negative.

So teams that have consistently exceeded expectations will have a sum of adjustments that is positive, while a team that has underperformed will have a sum of adjustments that is negative.
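This running total can be sketched in a few lines of Python. The 400-point logistic curve is the standard Elo formula; the 70-point home advantage is an illustrative assumption, not the calibration used here:

```python
def elo_expectation(home_elo, away_elo, home_adv=70):
    """Expected score for the home team (0 = away win, 0.5 = draw,
    1 = home win) under the standard Elo logistic curve. The 70-point
    home advantage is an illustrative assumption."""
    return 1 / (1 + 10 ** (-(home_elo + home_adv - away_elo) / 400))

def season_adjustment(matches):
    """Sum of (outcome - expectation) over a team's matches so far.
    Positive = exceeding expectations, negative = underperforming."""
    return sum(outcome - exp for exp, outcome in matches)

# The two examples from the text: winning as a slight underdog (0.45)
# then losing as a 0.7 favourite nets 0.55 - 0.7 = -0.15 overall.
total = season_adjustment([(0.45, 1.0), (0.7, 0.0)])
```

Summing these adjustments match by match, and plotting the running total per team, gives exactly the "story so far" curves discussed below.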

In the featured image for this post, we plot the story so far for the Premier League. Watford’s strong start is clear, and Liverpool’s seemingly relentless journey to the summit of the table on Christmas Day is there. However, up until two games ago, when they lost at Southampton, Arsenal were actually the most improved team.

Champions Manchester City are ninth placed in this table, falling from third on the eve of their match at Chelsea.

At the bottom, things are not surprising. Fulham, despite their summer investment, have struggled badly, as have Burnley following their success last season. Southampton and Crystal Palace are the two fastest improving sides at the moment.

Into the Championship, and indeed Leeds are the team most exceeding expectations, following on from their strong start, somewhat muted October, and strong November and December. Millwall are the biggest disappointment at the halfway point.


Into Leagues One and Two (below), Macclesfield and Notts County stand out in League Two for seriously, seriously bad performances relative to expectations.


The scourge of the goalless draw

It’s the classic trope wheeled out by those who don’t like football – goalless draws, boring! Even those of us who live and breathe football have to confess we’d rather see a game with goals, even if those goals are all a bit comical, as they were in the 2-2 draw between Man United and Arsenal in the week.

Just how frequent are goalless draws? Mark Lawrenson and Paul Merson spectacularly under-predict them; prior to the current season, Lawro had called just 8 0-0 draws in 2,617 recorded predictions, and Merson just 4 in 1,483 recorded predictions (thanks to @MyFootballFacts and @EightyFivePoints for the data). That’s low (about 0.3%), but how low compared to outcomes?

The featured pic for this post shows the frequency per season (northern hemisphere) over the history of data collected on Soccerbase. There’s been quite a bit of variation over time, and perhaps surprisingly for someone who got into football in the late 1980s and early 1990s, that isn’t the period when the most goalless draws were recorded – it’s actually the 1920s.

We can use the econometric technique of Indicator Saturation to determine shifts and outliers in this time series. The R package gets gives us the following plot:


So we see that since the late 1960s, things have been fairly constant, with (persistent) variation around about 8%. The middle panel gives the residuals, hence the difference each year from that mean level of 8%. We’re now on a run of 12 consecutive seasons below 8%, though not statistically significantly below it. If the downward trend continues, though, we may be looking at a new equilibrium sometime soon.
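The idea behind step-indicator saturation can be illustrated with a toy break-point search: try every candidate date for a step shift in the mean and keep the one that best explains the series. This is only a sketch of the intuition, not the gets package's split-sample selection algorithm, and the data below are synthetic:

```python
import numpy as np

def best_step_shift(y):
    """Toy step-shift detector: for each candidate break point, fit a
    two-regime mean and keep the split minimising the residual sum of
    squares. Illustrates the idea behind step-indicator saturation only."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    best = (None, np.sum((y - y.mean()) ** 2))  # (break index, SSR)
    for tau in range(2, n - 2):                 # leave a few obs each side
        ssr = (np.sum((y[:tau] - y[:tau].mean()) ** 2)
               + np.sum((y[tau:] - y[tau:].mean()) ** 2))
        if ssr < best[1]:
            best = (tau, ssr)
    return best

# Synthetic goalless-draw rates: a higher mean for 40 "seasons",
# then a shift down to ~8% for the remaining 50.
rng = np.random.default_rng(0)
y = np.concatenate([0.11 + 0.01 * rng.standard_normal(40),
                    0.08 + 0.01 * rng.standard_normal(50)])
tau, _ = best_step_shift(y)  # recovers a break near season 40
```

The gets package does this far more carefully, saturating the regression with indicators and retaining only those that survive significance testing, which is what produces the mean-shift and residual panels in the plot above.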

Is the art of defending, perhaps, a thing of the past?

Going with your gut: best practice in football score tips (judgemental forecasts)

The development of the Scorecasting Model is only a small part of a wider research project we are undertaking. As part of this project, we have been working with user data from an online sports prediction and fantasy games platform called Superbru (over 1.5 million players). We have been analysing the judgement forecasts of players of Superbru’s English Premier League Predictor Game from 2017/18. We have now written our first research paper based on these data, which we have titled “Going with your Gut: The (In)accuracy of Forecast Revisions in a Football Score Prediction Game.”

The paper asks whether players of the prediction game choosing to revise their scoreline picks leads to more accurate forecasts as match kick-offs approach. At first glance, this ought to be a no-brainer: of course revising scoreline picks should improve forecast accuracy, because over time there is more information about the nature of an upcoming football match, including injuries to key players and even starting lineups. But there are several possible sources of bias which could affect scoreline tips (or judgement forecasts) and their revisions.

In this research, we find that football tipsters should stick with their gut instincts. This appears to be true quite generally when it comes to forecasting football match scores, both accounting for any differences in forecasting ability between individuals and the differences in predictability between football matches. Revising a forecast (i.e. not sticking with their gut instincts) left the tipsters only 80% as likely to forecast a correct match scoreline compared with when they stuck with their first predictions. In those cases where game players did revise their forecasts, initial scoreline picks were just as good on average as when players didn’t make any revisions. We also found evidence of why game players did worse when they revised their forecasts: their revisions were excessive; perhaps they overreacted to some new and salient piece of information about the upcoming match.

These results have some similarities with those found more widely in the academic literature on behavioural forecasting. We hope to use them to guide possible field experiments among communities of sports tipsters (forecasters). One interesting application is whether or not we can find ways to improve the power of crowds in forecasting.

We are also carrying out other research on how to evaluate the football score judgement forecasts made by tipsters, and how forecasting behaviour and evaluation should respond to differences or changes in the “rules of the game” being played by a sports tipster. In general, we find football matches particularly strange objects to forecast, as we have touched on in this blog before. This is mostly explained by the simple fact that frequently the most likely scoreline in any given football match will conflict with the most likely result. There are other situations where this can also be true, and where forecasts could have somewhat greater socio-economic importance.

Lower Leagues, R4 (21-22 August)

Midweek is a busy week in the Football League, with a full set of fixtures in Leagues One and Two. We also continue to refine our conditional forecasts. There’s a fuzzy area where all three probabilities (home win, away win and draw) are similar, though usually with the draw a little less likely than the home or away win.

But if the draw is at 28%, the home win at 32% and the away win at 30%, a draw scoreline may well be a better forecast than a win for either team. It’s an empirical question at what point a draw becomes the right scoreline to predict, relative to a win for either side, and one we will look at in time.

In the meantime, we make use of Claude Shannon’s measure of entropy, which is highest the more “undecided” a market is — that is, the closer each of the three probabilities in a football match is to 33.333%. If the entropy is above 1.09 (corresponding to probabilities between roughly 28% and 39%), we conclude that a draw is the most sensible call, and provide a draw scoreline as our prediction. These are our conditional forecasts.
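The entropy rule is easy to sketch. With natural logarithms, the maximum entropy for three outcomes is ln 3 ≈ 1.0986, so 1.09 is just below the fully undecided case; the function names below are our own, not from any particular library:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (natural log) of an outcome distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_pick(p_home, p_draw, p_away, threshold=1.09):
    """If the market is sufficiently 'undecided' (entropy above the
    threshold; the three-outcome maximum is ln 3 ~ 1.0986), call a
    draw; otherwise back the more likely side."""
    h = shannon_entropy([p_home, p_draw, p_away])
    if h > threshold:
        return "draw"
    return "home" if p_home > p_away else "away"
```

For instance, a perfectly undecided 33.3/33.3/33.3 match yields a draw call, while a 48/23/29 match (like Colchester v Crewe below) has entropy around 1.05 and stays a home pick.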

In the next two tables, we present our forecasts constructed this way. In League One, the strong starts from Sunderland and Barnsley seem set to continue (2-1 away wins, both with probability 9%, but both with win probabilities above 40%). In League Two, Lincoln and MK Dons may end up stealing a march on Exeter and Stevenage in the early four-way tie at the summit of the league.

League Two
Home          Away          Most likely  Pr (%)  P(H)  P(A)
Lincoln       Bury          1-0          10      40    34
Macclesfield  Cheltenham    1-0          11      40    32
Colchester    Crewe         2-1          9       48    29
Cambridge U   Exeter        1-0          11      40    33
MK Dons       Grimsby       1-0          13      59    18
Tranmere      Mansfield     1-1          13      37    35
Morecambe     Northampton   1-2          9       29    47
Newport Co    Notts Co      1-0          11      43    31
Yeovil        Oldham        0-1          13      30    40
Carlisle      Port Vale     1-0          11      46    28
Forest Green  Stevenage     2-1          9       39    37
Crawley       Swindon       1-2          9       31    44

League One
Home          Away          Most likely  Pr (%)  P(H)  P(A)
Oxford        Accrington    1-0          11      51    23
Rochdale      Barnsley      1-2          9       35    40
Bradford      Burton        1-1          13      39    30
Blackpool     Coventry      1-0          14      47    24
Charlton      Peterborough  2-1          9       43    31
Bristol R     Portsmouth    1-0          9       40    34
Doncaster     Shrewsbury    1-0          11      48    26
Luton         Southend      1-0          9       40    33
AFC W’don     Walsall       1-0          12      42    29
Plymouth      Wycombe       1-0          14      53    20
Scunthorpe    Fleetwood     1-0          14      53    21
Gillingham    Sunderland    1-2          9       34    42



Championship, R3 (17-19 August, 2018)

Our forecasts for Round 3 of the Championship are in the Table below.

But first, a quick word on how and why we are expanding our range of forecasts:

Over the past couple of weeks, we have extensively evaluated our forecasting model, both on the 2018/19 results, but also on how it would have performed over past seasons.

If we were only interested in forecasting correct scores, then we would happily just report each week on what the “Most likely” scoreline is. However, the model also tells us what the most likely result is, and this will often conflict with our forecast of the most likely score. This is simply because draws are relatively rare among results, but relatively common among scorelines. People care about results, probably more than they care about scores.

Also, we now believe that prediction performance metrics on the BBC Sport and Sky Sports websites, as well as online games such as Superbru, favour conditional forecasts. That is, to perform well on those games, players should first pick what they think is the most likely result, and only then pick the scoreline sticking to that result.

Therefore, we are expanding on our forecasts to reflect all of the above.

In the table below, we now predict:

  • The Most Likely scoreline, with the % chance of that happening
  • The % chance of a win by the Home team, P(H), or Away team, P(A) (with one hundred minus those two numbers giving the % chance of a draw)
  • The Conditional scoreline, if the most likely result happens, with the associated % chance of that happening among all possible scorelines
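The conditional pick in the last bullet can be sketched as follows; the scoreline grid and its probabilities here are made up for illustration, not output from our model:

```python
def conditional_forecast(score_probs):
    """score_probs maps (home_goals, away_goals) -> probability.
    First find the most likely result (H/D/A) by summing scoreline
    probabilities, then pick the most likely scoreline consistent
    with that result."""
    result = lambda hg, ag: "H" if hg > ag else ("A" if hg < ag else "D")
    totals = {"H": 0.0, "D": 0.0, "A": 0.0}
    for (hg, ag), p in score_probs.items():
        totals[result(hg, ag)] += p
    best_result = max(totals, key=totals.get)
    score, p = max(((s, q) for s, q in score_probs.items()
                    if result(*s) == best_result), key=lambda sq: sq[1])
    return best_result, score, p

# Illustrative grid: 1-1 is the single most likely scoreline (13%),
# but home-win scorelines sum to more than the draw scorelines do,
# so the conditional forecast is a home win, 1-0.
grid = {(1, 1): 0.13, (0, 0): 0.08, (1, 0): 0.12, (2, 1): 0.10,
        (2, 0): 0.09, (0, 1): 0.10, (1, 2): 0.08}
pick = conditional_forecast(grid)
```

This is exactly why the Conditional column below sometimes disagrees with the Most Likely column: a draw scoreline can top the grid even when a draw is not the most likely result.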
Home       Away        Most likely  Pr (%)  P(H)  P(A)  Conditional  C.Pr (%)
B’ham      Swansea     1-1          13      38    34    1-0          12
Ipswich    A Villa     1-1          13      31    41    0-1          11
Hull       B’burn      2-2          7       39    40    1-2          7
Reading    Bolton      2-0          10      66    15    2-0          10
Millwall   Derby       0-1          14      31    38    0-1          14
Bristol C  M’bro       1-1          11      53    23    1-0          10
Sheff Utd  Norwich     1-0          14      62    15    1-0          14
Wigan      N Forest    3-2          5       54    29    3-2          5
W Brom     QPR         0-1          14      27    43    0-1          14
Leeds      R’ham       4-1          7       85    5     4-1          7
Preston    Stoke       1-0          18      51    19    1-0          18
Brentf’d   Sheff Wed   1-1          12      45    30    1-0          10

Evaluation and Scoring Rules

A bunch of “other” matches took place last night, which we forecast. We also presented conditional forecasts, perhaps a bit more intuitive because while 1-1 is often the most likely scoreline, a draw is hardly ever the most likely result.

After the event, the natural question is: how did we do? The table below shows that our standard forecasts (loads of 1-1s) got 15 right results, and 8 scores (from 46 matches), while our conditional forecasts got 22 right results, and 5 scores.

             Results  Scores  Lawro score  Sky score  Made up score
Forecast     15       8       47           35         31
Conditional  22       5       42           34.5       32

So what is better? To get more scores, or get more results? This all depends on preferences. The scoring rule of Mark Lawrenson’s forecasts on BBC Sport is 40 points for a score, 10 for a result. The Sky Super Six scoring rule, which we might attribute to Paul Merson’s forecasts, is 5 for a score, 2 for a result, thus valuing a score less than the BBC does. By the Lawro score (scaled down to make it comparable to Sky), our unconditional forecasts got 47, our conditional ones 42. By the Sky score, the difference was one half: 35 to 34.5. If we value a score at only twice a result (rather than 2.5 times as Sky does, or 4 times, as Lawro does), then we get that our conditional forecasts were better, scoring 32 to 31.
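The comparison can be reproduced in a few lines. We assume, as the table’s arithmetic implies, that the Results column counts correct results that were not also exact scores:

```python
def scaled_points(result_only, exact_scores, result_pts, score_pts, scale):
    """Points under a scoring rule, divided by `scale` so the rules
    are comparable. `result_only` counts correct results that were
    not exact scores; an exact score earns the score points."""
    return (result_only * result_pts + exact_scores * score_pts) / scale

# (result-only hits, exact-score hits) from the table above
tallies = {"Forecast": (15, 8), "Conditional": (22, 5)}
# (result points, score points, scale): BBC 'Lawro', Sky Super Six, 2:1 rule
rules = {"Lawro": (10, 40, 10), "Sky": (2, 5, 2), "2:1": (1, 2, 1)}

table = {name: {rule: scaled_points(r, s, *pts)
                for rule, pts in rules.items()}
         for name, (r, s) in tallies.items()}
```

Running this recovers the table exactly: 47 v 42 under the Lawro rule, 35 v 34.5 under Sky’s, and 31 v 32 under the made-up 2:1 rule, so which forecast “wins” flips with the scoring rule.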

Scoring rules matter, and may well matter for how players play games. Scoring rules probably also reflect, to some extent, our preferences and beliefs. It’s clearly much harder to get an exact score right, so why not reward that, like Lawro’s score does, by a lot more than just a result?

One-alls, and conditional probabilities

Another 1-1 draw

A few patterns are emerging, after a couple of weeks of forecasting. One is that it’s harder work predicting scores outside the Premier League, but the other is that as scores are individually low probability events, anomalies can arise.

In expanding our model to cope with promotion and relegation, and longer term trends in teams’ strengths, we’re now estimating over more seasons and more divisions. The upshot appears to be that we predict a lot of 1-1 draws. Now, is that a bad thing?

If we look at every single match on the Soccerbase website, we find 11% of matches have finished 1-1 — the most common score ever. Almost every 1-1 we predict comes with a probability of about 11%. So perhaps it is not the worst thing in the world that our model predicts this quite a lot.

Equally, though, it might be a sign that our model is not really able to distinguish between the kinds of matches that do finish 1-1 and the ones that don’t. It’s all well and good predicting 1-1 every single time, but it’s hardly very insightful.

An alternative is to consider conditional probabilities — probabilities conditional on the most likely result occurring. Despite 1-1 being the most common score, the most common result is a home win: about 46% of the time the home team wins, while draws occur only about 24% of the time. So a conditional forecast first determines the most likely result, and then produces as the score forecast the most likely score delivering that particular outcome. In our forecasts for tonight’s games in the National League and EFL Cup, we also produce a column with conditional forecasts.