Having Elo ratings for teams over all time is cool, but how do we know that it’s meaningful? Sure, we can look at the Stanley Cup winning team each year and see that they typically have a good rating. Or we can anecdotally look back at our favourite team, remember how good or bad they were for a few seasons in the past, and check that they were near the top or the bottom of the pile at that point in time.
For example, here’s a plot of the Stanley Cup (or, at least, the season championship) winning team’s rating and the average rating of the league(s) at that time. Remember, I have WHA data mixed in, so you’ll notice that the Houston Aeros slip through the cracks in this quick analysis. And, because each team’s history is carried through under its current name, you can see that the Arizona Coyotes won the championship at one time (the 1976 WHA title, as the Winnipeg Jets). You can see that the winning team is typically rated much better than the average, as expected.
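If you wanted to rebuild a plot like that, a rough sketch along these lines would do it. The `season_ratings` table (columns `season`, `team`, `rating` holding end-of-season Elo) and the `champions` dict mapping each season to its championship-winning franchise are placeholder names here, not anything from my actual code:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder inputs:
#   season_ratings: DataFrame with columns ['season', 'team', 'rating'] (end-of-season Elo)
#   champions: dict mapping season -> championship-winning team name

def champion_vs_league_average(season_ratings: pd.DataFrame, champions: dict) -> pd.DataFrame:
    rows = []
    for season, champ in champions.items():
        season_df = season_ratings[season_ratings["season"] == season]
        rows.append({
            "season": season,
            "league_avg": season_df["rating"].mean(),
            "champion_rating": season_df.loc[season_df["team"] == champ, "rating"].mean(),
        })
    return pd.DataFrame(rows).set_index("season").sort_index()

# summary = champion_vs_league_average(season_ratings, champions)
# summary.plot(marker="o")
# plt.ylabel("Elo rating")
# plt.show()
```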
But there should be some quantitative things we can check to make sure that a) the ratings actually relate to how teams do, and b) if we use them to make predictions, those predictions have at least some value.
First, we’ll extract the Elo rating for each team going into a game.
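Since the ratings were built by replaying the whole game history anyway, the simplest approach is to walk through the games in order and record each team’s rating just before it gets updated. A rough sketch of that idea follows; the `games` column names, the K-factor, and the plain Elo update are placeholders, so swap in whatever your actual model uses:

```python
import pandas as pd

def add_pregame_elo(games: pd.DataFrame, k: float = 8.0, base: float = 1500.0) -> pd.DataFrame:
    """Replay games in order, recording each team's Elo *before* the result is applied."""
    ratings = {}
    home_pre, away_pre = [], []

    # Row-by-row iteration is what makes this slow on a century of games.
    for game in games.itertuples():
        h = ratings.setdefault(game.home_team, base)
        a = ratings.setdefault(game.away_team, base)
        home_pre.append(h)
        away_pre.append(a)

        # Plain Elo update as a stand-in; use whatever K-factor / home bonus the real model has.
        expected_home = 1.0 / (1.0 + 10 ** ((a - h) / 400.0))
        if game.home_goals > game.away_goals:
            result = 1.0
        elif game.home_goals < game.away_goals:
            result = 0.0
        else:
            result = 0.5
        ratings[game.home_team] = h + k * (result - expected_home)
        ratings[game.away_team] = a - k * (result - expected_home)

    out = games.copy()
    out["home_elo_pre"] = home_pre
    out["away_elo_pre"] = away_pre
    return out
```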
Having done that (warning, this is a slow implementation; speeding it up would be very, very helpful), we can try making some plots of Elo vs. different aspects of the games’ results. Let’s start by simply looking at the predictive power for each game.
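Roughly, that kind of plot can be produced by binning games on the home side’s Elo advantage (with a home-ice bonus folded in) and computing the proportion of those games the home team actually won. Here’s a sketch built on the hypothetical `home_elo_pre`/`away_elo_pre` columns from above; the `HOME_BONUS` value is a placeholder, not the bonus my model actually uses:

```python
import matplotlib.pyplot as plt

HOME_BONUS = 35.0  # placeholder home-ice adjustment

def win_proportion_by_elo_advantage(games, bin_width=25):
    adv = games["home_elo_pre"] + HOME_BONUS - games["away_elo_pre"]
    home_win = (games["home_goals"] > games["away_goals"]).astype(float)
    bins = (adv // bin_width) * bin_width  # bucket the advantage into bin_width-point steps
    grouped = home_win.groupby(bins)
    return grouped.mean(), grouped.size()

# props, counts = win_proportion_by_elo_advantage(games_with_elo)
# plt.scatter(props.index, props.values, s=counts ** 0.5)
# plt.xlabel("Home Elo advantage (incl. home bonus)")
# plt.ylabel("Win proportion")
# plt.show()
```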
[Plot: win proportion by Elo advantage, including the home-ice bonus]
Is it predictive? Yes. Is it strongly predictive? I’d say no. There are plenty of examples where the better team loses or the worse team wins. At one point a team rated over 400 points higher lost. Similarly, there are plenty of examples of teams rated over 300 points worse coming away with the win.
The thing is, we don’t know by what margins these teams won or lost. Maybe we can get more of that information out of a goal differential relationship.
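A plot like that can be put together by scattering goal differential against the pre-game Elo differential and running a simple least-squares fit through it. Another sketch on the same hypothetical columns (which quantity sits on which axis here is my choice for the sketch, not necessarily how the original figure was drawn):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_goal_diff_vs_elo_diff(games):
    elo_diff = games["home_elo_pre"] - games["away_elo_pre"]
    goal_diff = games["home_goals"] - games["away_goals"]

    slope, intercept = np.polyfit(elo_diff, goal_diff, 1)  # line of best fit
    xs = np.linspace(elo_diff.min(), elo_diff.max(), 100)

    plt.scatter(elo_diff, goal_diff, s=4, alpha=0.2)
    plt.plot(xs, slope * xs + intercept, color="red")
    plt.xlabel("Pre-game Elo differential")
    plt.ylabel("Goal differential")
    plt.show()
    return slope, intercept
```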
That looks much better. There are many examples of teams with better ratings losing, but they typically don’t lose by much. The inverse is true too.
We can look at the data and say, with more confidence, that there is a loose relationship between Elo rating differential and goal differential. While there’s still lots of uncertainty (as with all macro prediction schemes), there is a relationship.
For those who are curious, the equation of that line of best fit is y = 10.399x − 5.454 (coefficients rounded).
One other thing we can do is plot the proportion of wins, losses, and ties by Elo difference:
Wow! Serious correlation here. It’s important to note that wins include overtime wins but not shootout wins, and similarly for losses; shootouts and proper ties are handled as draws. We can also see that the sum of the three linear best fits is quite close to one (about 0.95) with only a very slight slope (approximately 10e-5), as we would expect, since the win, loss, and tie proportions have to sum to one at every Elo difference.
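For completeness, here’s how a check like that could be run: bin the games by Elo difference, take the share of wins, losses, and draws in each bin, fit a line to each outcome, and add the three fits together. Again a sketch on the hypothetical columns above, with an assumed `result` column holding 'W', 'L', or 'T' from the home side’s point of view (shootout results and real ties both coded as 'T'):

```python
import numpy as np
import pandas as pd

def wlt_proportions_by_elo_diff(games: pd.DataFrame, bin_width: float = 25.0):
    elo_diff = games["home_elo_pre"] - games["away_elo_pre"]
    bins = (elo_diff // bin_width) * bin_width
    # Share of each outcome within every Elo-difference bucket (rows sum to 1).
    props = pd.crosstab(bins, games["result"], normalize="index")

    # Fit a line to each outcome and confirm the fits sum to roughly the constant 1.
    fits = {col: np.polyfit(props.index, props[col], 1) for col in props.columns}
    slope_sum = sum(f[0] for f in fits.values())
    intercept_sum = sum(f[1] for f in fits.values())
    print(f"sum of fits: {intercept_sum:.3f} + {slope_sum:.1e} * elo_diff")
    return props

# props = wlt_proportions_by_elo_diff(games_with_elo)
# props.plot(marker="o")  # one curve each for wins, losses, and ties
```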