Predicting scores (and seasons) with Dixon-Coles is interesting, but it’s only one of many ways of making ‘game-level’ predictions. There’s also a family of rating systems based on Elo, which was originally developed to rank chess players. Extensions of Elo include a version with parameters modified by the World Chess Federation (FIDE), a modification that adds uncertainty and ‘reliability’ called Glicko, and a more heavily parameterized version of Glicko developed in 2012 called Stephenson. These are all implemented in the PlayerRatings package in R. There’s also a modification of Glicko developed by Microsoft called TrueSkill, which is implemented in the aptly named trueskill package. Note that TrueSkill is a closed licence product, available only for non-commercial implementations.
We’ll compare all of these methods on their historical performance in the NHL, and (eventually) use them to predict the coming season. TrueSkill has a few oddities, so we’ll look at it later.
For the PlayerRatings package, we’ll need to add one column to our data: the ‘result’, which records who won the game. If the home team wins, the result should be 1. If the away team wins, it should be 0. Logically, draws should be given 0.5, but there are other ways of viewing this that I’ll test (could OT wins count as a win, but SO wins as a draw?).
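As a rough illustration of that coding, here is a minimal base R sketch. The helper `game_result()`, the column names and the `decided` flag ("REG", "OT" or "SO") are all assumptions made for this example, not part of the original data or of PlayerRatings.

```r
# Hypothetical helper: code each game from the home team's point of view.
# 1 = home win, 0 = away win, 0.5 = draw (tied score).
# If so_as_draw is TRUE, any game decided in a shootout is scored as a draw.
game_result <- function(home_goals, away_goals, decided = "REG",
                        so_as_draw = FALSE) {
  result <- ifelse(home_goals > away_goals, 1,
                   ifelse(home_goals < away_goals, 0, 0.5))
  if (so_as_draw) {
    result[decided == "SO"] <- 0.5
  }
  result
}

# Made-up games, purely for illustration
example_games <- data.frame(
  home_goals = c(3, 2, 4),
  away_goals = c(1, 3, 3),
  decided    = c("REG", "SO", "OT"),
  stringsAsFactors = FALSE
)

result_plain <- game_result(example_games$home_goals,
                            example_games$away_goals)
result_so    <- game_result(example_games$home_goals,
                            example_games$away_goals,
                            example_games$decided,
                            so_as_draw = TRUE)
```

With the second coding, the shootout loss in the middle row becomes 0.5 for both teams, while the OT win in the last row still counts as a full win.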
There’s a nuance to the code as well: it expects dates as a “numerical vector denoting the time period in which the game took place” rather than as dates proper. This allows week-by-week analysis for NFL games, or monthly chess rankings, but doesn’t work as well for hockey, where a team may be a few games behind or ahead of the competition.
We’ll wrangle only the data we need:
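A minimal sketch of what that wrangling might look like, assuming a game log with hypothetical columns `date`, `home_team`, `away_team` and `result`. Here each calendar day becomes one integer rating period, which is one possible choice rather than the only one.

```r
# Made-up game log; column names are assumptions for illustration.
games <- data.frame(
  date      = as.Date(c("2015-10-08", "2015-10-07", "2015-10-07")),
  home_team = c("Dallas Stars", "Toronto Maple Leafs", "Chicago Blackhawks"),
  away_team = c("Pittsburgh Penguins", "Montreal Canadiens", "New York Rangers"),
  result    = c(1, 0, 0),
  stringsAsFactors = FALSE
)

# Order chronologically, then number the days from 1 so that every game
# played on the same date falls into the same rating period.
games <- games[order(games$date), ]
games$period <- as.integer(games$date - min(games$date)) + 1L

# Keep only the four columns, in the order: period, home, away, result.
ratings_input <- games[, c("period", "home_team", "away_team", "result")]
```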
The PlayerRatings code then chews through the games day by day, taking as input the prior ratings and the games on that day. We’ll process the whole history of the NHL for base Elo like this. Note that this uses none of the common adjustments: no regression to the mean after each season, no importance factor for playoff vs. regular season games, no home ice advantage, and no adjustment for the win margin (no benefit to winning 6-1 vs. 2-1 in OT). We’ll use a k factor of 8, which sits between the commonly used 20 for football and 4 for baseball, and is a commonly accepted k for hockey.
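To make the update rule concrete, here is a hand-rolled sketch of plain Elo in base R. It is not the PlayerRatings implementation; the function names, the starting rating of 1500 and the choice to update every game in a period from the ratings held at the start of that period are assumptions for illustration.

```r
# Expected score for the home team under the standard Elo logistic curve
elo_expected <- function(r_home, r_away) {
  1 / (1 + 10^((r_away - r_home) / 400))
}

# Process games period by period with no home advantage, no margin of
# victory and no season-end regression (the bare version described above).
run_elo <- function(games, init = 1500, k = 8) {
  home <- as.character(games$home_team)
  away <- as.character(games$away_team)
  teams <- sort(unique(c(home, away)))
  ratings <- setNames(rep(init, length(teams)), teams)

  for (p in sort(unique(games$period))) {
    idx <- which(games$period == p)
    start <- ratings  # ratings as they stood before this period
    for (i in idx) {
      delta <- k * (games$result[i] - elo_expected(start[[home[i]]],
                                                   start[[away[i]]]))
      ratings[home[i]] <- ratings[home[i]] + delta  # home gains delta
      ratings[away[i]] <- ratings[away[i]] - delta  # away loses the same
    }
  }
  sort(ratings, decreasing = TRUE)
}

# Two made-up games: A beats B at home, then A beats B away.
example_games <- data.frame(
  period    = c(1L, 2L),
  home_team = c("A", "B"),
  away_team = c("B", "A"),
  result    = c(1, 0),
  stringsAsFactors = FALSE
)
example_ratings <- run_elo(example_games)
```

With k = 8, two evenly rated teams exchange exactly 4 points on a win, so after the first game A sits at 1504 and B at 1496. Because every gain is matched by an equal loss, the total rating across teams stays fixed.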
    Player                 Rating
 1  Pittsburgh Penguins    1605.925
 2  Washington Capitals    1593.662
 3  St. Louis Blues        1593.535
 4  Anaheim Ducks          1585.563
 5  Chicago Blackhawks     1582.401
 6  Tampa Bay Lightning    1579.913
 8  Dallas Stars           1574.605
 9  San Jose Sharks        1571.384
10  New York Rangers       1566.841
11  Los Angeles Kings      1550.622
13  Nashville Predators    1538.125
14  New York Islanders     1537.453
15  Boston Bruins          1531.571
16  Florida Panthers       1523.907
18  Colorado Avalanche     1517.049
19  Minnesota Wild         1513.761
20  Philadelphia Flyers    1513.489
21  Montreal Canadiens     1512.463
22  Detroit Red Wings      1509.428
23  Ottawa Senators        1507.323
24  Columbus Blue Jackets  1499.695
28  Calgary Flames         1491.623
29  Winnipeg Jets          1490.907
33  New Jersey Devils      1477.679
34  Vancouver Canucks      1475.976
36  Carolina Hurricanes    1458.905
40  Arizona Coyotes        1444.929
41  Buffalo Sabres         1434.475
43  Toronto Maple Leafs    1423.959
44  Edmonton Oilers        1422.394
Similarly, we can get the Glicko and Stephenson ratings:
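For reference, a hedged sketch of how the three PlayerRatings functions can be called on a four-column input (period, home, away, result). The tiny data frame is made up, and the starting ratings and deviations passed to `init` are assumptions, not necessarily the values used for the tables in this post. The calls are guarded so the snippet still runs if the package is not installed.

```r
# Made-up input in the layout PlayerRatings expects:
# time period, player one (home), player two (away), result.
ratings_input <- data.frame(
  period = c(1L, 1L, 2L, 2L),
  home   = c("Boston Bruins", "Calgary Flames",
             "Ottawa Senators", "Winnipeg Jets"),
  away   = c("Ottawa Senators", "Winnipeg Jets",
             "Calgary Flames", "Boston Bruins"),
  result = c(1, 0, 0.5, 1),
  stringsAsFactors = FALSE
)

if (requireNamespace("PlayerRatings", quietly = TRUE)) {
  # Plain Elo: a single starting rating and a k factor
  elo_fit <- PlayerRatings::elo(ratings_input, init = 1500, kfac = 8)

  # Glicko and Stephenson: init is c(starting rating, starting deviation)
  glicko_fit <- PlayerRatings::glicko(ratings_input, init = c(1500, 300))
  steph_fit  <- PlayerRatings::steph(ratings_input, init = c(1500, 300))

  # Each fit carries a data frame of current ratings
  print(glicko_fit$ratings[, c("Player", "Rating")])
  print(steph_fit$ratings[, c("Player", "Rating")])
}
```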
Glicko:
Player                 Rating
Pittsburgh Penguins    2056.885
San Jose Sharks        1961.899
Tampa Bay Lightning    1961.123
Washington Capitals    1919.588
Dallas Stars           1908.682
St. Louis Blues        1906.094
Nashville Predators    1879.422
Philadelphia Flyers    1875.929
Anaheim Ducks          1869.937
Chicago Blackhawks     1855.996
New York Rangers       1852.835
New York Islanders     1842.990
Florida Panthers       1841.062
Buffalo Sabres         1819.547
Ottawa Senators        1818.789
Columbus Blue Jackets  1814.828
Winnipeg Jets          1799.448
Los Angeles Kings      1797.593
Boston Bruins          1794.260
Montreal Canadiens     1791.128
Detroit Red Wings      1778.996
Minnesota Wild         1776.143
New Jersey Devils      1774.495
Calgary Flames         1764.142
Colorado Avalanche     1758.090
Arizona Coyotes        1745.809
Carolina Hurricanes    1739.298
Vancouver Canucks      1722.548
Toronto Maple Leafs    1711.570
Edmonton Oilers        1710.663
Stephenson:
    Player                 Rating
 1  Pittsburgh Penguins    1672.636
 2  San Jose Sharks        1605.519
 3  Tampa Bay Lightning    1604.038
 4  Washington Capitals    1571.463
 6  Dallas Stars           1561.292
 8  St. Louis Blues        1556.631
10  Nashville Predators    1543.061
11  Philadelphia Flyers    1541.821
13  Anaheim Ducks          1529.501
14  Chicago Blackhawks     1526.866
15  New York Rangers       1525.087
17  New York Islanders     1515.128
18  Florida Panthers       1514.468
19  Buffalo Sabres         1509.621
21  Ottawa Senators        1505.674
22  Columbus Blue Jackets  1502.818
23  Winnipeg Jets          1494.553
24  Montreal Canadiens     1487.634
26  Boston Bruins          1479.551
27  Los Angeles Kings      1477.632
28  Minnesota Wild         1471.183
29  Detroit Red Wings      1467.098
30  New Jersey Devils      1466.709
31  Calgary Flames         1466.100
33  Colorado Avalanche     1451.003
35  Arizona Coyotes        1448.379
36  Carolina Hurricanes    1441.683
37  Vancouver Canucks      1435.352
39  Toronto Maple Leafs    1426.483
40  Edmonton Oilers        1424.904
Note that we’ve dropped the historic NHL teams from these tables, which may account for some of the deviation of the average from 1500. For a better idea of the current state of teams, we’ll look at data from 1991-92 onwards, when the San Jose Sharks joined the league.
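Restricting the history is just a date filter followed by renumbering the rating periods. A minimal sketch, assuming a hypothetical `games` data frame with a `date` column and using 1 October 1991 as a convenient cutoff that falls before the first game of that season:

```r
# Made-up game log spanning the cutoff; column names are assumptions.
games <- data.frame(
  date      = as.Date(c("1990-11-01", "1991-10-04", "1991-10-05")),
  home_team = c("Boston Bruins", "Vancouver Canucks", "San Jose Sharks"),
  away_team = c("Montreal Canadiens", "San Jose Sharks", "Vancouver Canucks"),
  result    = c(1, 1, 0),
  stringsAsFactors = FALSE
)

# Assumed cutoff: any date shortly before the 1991-92 season opener works.
cutoff <- as.Date("1991-10-01")
modern <- games[games$date >= cutoff, ]

# Renumber the periods so the first retained day is period 1.
modern$period <- as.integer(modern$date - min(modern$date)) + 1L
```

Every team then starts again from the initial rating, so the ratings reflect only games from the chosen era.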
These result in the following:
It looks like both Glicko and Stephenson ratings are ‘high’ compared to the primary Elo score. Checking the averages confirms this, with Glicko averaging 1661.28 and Stephenson averaging 1520.64. This is not ideal: Elo should, by nature, remain centred around its starting value. Additionally, there’s no normalization at the end of every season, which is a common feature in sports team predictions (as opposed to the individual rankings in chess that Elo was developed for). Next post we’ll build our own Elo ranking that corrects for these and more.
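As a rough sketch of the two checks mentioned here, the snippet below measures how far a set of ratings has drifted from the starting value and applies a simple season-end pull towards it. The function names, the made-up ratings and the weight of one third are illustrative assumptions, not the method the next post settles on.

```r
# Made-up ratings whose mean has drifted above the 1500 starting value
ratings <- c(TeamA = 1700, TeamB = 1650, TeamC = 1600)

# How far the average has drifted from the starting value
drift <- mean(ratings) - 1500

# Shift every rating by the same amount so the mean is back at the centre
recentre <- function(ratings, centre = 1500) {
  ratings - mean(ratings) + centre
}

# Season-end regression: move each rating a fraction `weight` of the way
# back towards the centre (weight = 0 leaves ratings untouched).
regress_to_mean <- function(ratings, centre = 1500, weight = 1 / 3) {
  centre + (1 - weight) * (ratings - centre)
}

recentred <- recentre(ratings)
regressed <- regress_to_mean(c(1600, 1400))
```

Recentring preserves the gaps between teams, while regression shrinks them, so a team 100 points above the centre starts the next season about 67 points above it.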