NHL Elo
Predicting scores (and seasons) with Dixon-Coles is interesting, but it’s only one of many ways of making ‘game-level’ predictions. Another is the family of rating systems based on Elo, which was originally developed to rank chess players. Elo has a number of extensions, including parameter modifications by the World Chess Federation (FIDE), a version that adds uncertainty and ‘reliability’ called Glicko, and a more heavily parameterized version of Glicko developed in 2012 called Stephenson. These are all implemented in the PlayerRatings package in R. There’s also a modification of Glicko developed by Microsoft called TrueSkill, which is implemented in the aptly named trueskill package. Note that TrueSkill is a closed-licence product, available only for non-commercial implementations.
We’ll compare all of these methods on their historical performance in the NHL, and (eventually) use them to predict the coming season. TrueSkill has a few oddities, so we’ll look at it later.
library(PlayerRatings)
library(knitr) # provides kable(), used for the tables below
nhl_all<-as.data.frame(readRDS("./_data/hockeyData.Rds"))
For the PlayerRatings package, we’ll need to add one column to our data: the ‘result’, which records who won the game. The package scores each game from the point of view of the first team listed, and we’ll be passing the visitor first, so the value should be 1 if the visitor wins and 0 if the home team wins. Logically, draws should be given 0.5, but there are other ways of viewing this that I’ll test (could OT wins be a win, but SO wins a draw?).
# Compare the goal columns directly; apply() would coerce each row to character before comparing.
nhl_all$Result<-ifelse(nhl_all[,3] > nhl_all[,5], 1, ifelse(nhl_all[,3] < nhl_all[,5], 0, 0.5))
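To make the encoding concrete, here is a minimal sketch on a toy data frame. The column names (`GoalsFor`, `GoalsAgainst`, `Extra`) and the `"SO"` marker are assumptions for illustration only, not the layout of the real data set. It shows the plain win/draw/loss encoding next to the alternative that scores shootout games as draws.

```r
# Toy games, scored from the point of view of the first-listed team.
# Column names and the "SO" marker are hypothetical.
games <- data.frame(
  GoalsFor     = c(3, 2, 4, 1),
  GoalsAgainst = c(2, 2, 5, 2),
  Extra        = c("", "", "OT", "SO"),
  stringsAsFactors = FALSE
)

# 1 = win, 0 = loss, 0.5 = tie, computed on whole columns at once.
encode_result <- function(goals_for, goals_against) {
  ifelse(goals_for > goals_against, 1,
         ifelse(goals_for < goals_against, 0, 0.5))
}

games$Result <- encode_result(games$GoalsFor, games$GoalsAgainst)

# Alternative: overtime wins stay wins, but shootout games count as draws.
games$ResultSO <- ifelse(games$Extra == "SO", 0.5, games$Result)

print(games)
```

Only the last game differs between the two columns: it is a loss under the plain encoding and a draw under the shootout variant.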
There’s a nuance here as well: the package prefers dates supplied as a “numerical vector denoting the time period in which the game took place” rather than as dates proper. This allows week-by-week analysis for NFL games, or monthly chess rankings, but works less well for hockey, where a team may be a few games behind or ahead of the competition. We’ll simply number each game day in order.
dates<-unique(sort(nhl_all$Date))
nhl_all$DateFactor<-match(nhl_all$Date, dates)
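As a small illustration of what this index looks like, the sketch below maps a few made-up game dates to consecutive integers, so that games played on the same day share a period. The variable names are hypothetical; `match()` is used here as a vectorized way to look up each date’s position.

```r
# Made-up game dates; the first two games fall on the same day.
game_dates <- as.Date(c("2016-10-12", "2016-10-12", "2016-10-13", "2016-10-15"))

# Sorted unique dates define the periods.
periods <- sort(unique(game_dates))

# match() returns, for each game, the position of its date among the periods.
date_index <- match(game_dates, periods)

print(date_index)
# [1] 1 1 2 3
```

Note that the gap between 13 and 15 October disappears: the index counts game days, not calendar days.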
We’ll wrangle only the data we need:
nhl_results<-nhl_all[,c(13,2,4,12)]
nhl_results$Visitor<-as.character(nhl_results$Visitor)
nhl_results$Home<-as.character(nhl_results$Home)
The PlayerRatings code then chews through the games one day at a time, taking the prior ratings and that day’s games as input. We’ll process the whole history of the NHL for base Elo like this. Note that none of the common adjustments are applied: no regression to the mean after each season, no importance factor for playoff vs. regular season games, no home ice advantage, and no adjustment for the win margin (no benefit to winning 6-1 vs. 2-1 in OT). We’ll use a k factor of 8, which sits between the commonly used 20 for football and 4 for baseball, and is a commonly accepted k for hockey.
nhlelo<-elo(nhl_results, init=1500, kfac=8, history=TRUE)
kable(nhlelo$ratings[nhlelo$ratings$Lag < 100,c(1,2)], caption="NHL Elo Ratings")
Rank | Player | Rating |
---|---|---|
1 | Pittsburgh Penguins | 1605.925 |
2 | Washington Capitals | 1593.662 |
3 | St. Louis Blues | 1593.535 |
4 | Anaheim Ducks | 1585.563 |
5 | Chicago Blackhawks | 1582.401 |
6 | Tampa Bay Lightning | 1579.913 |
8 | Dallas Stars | 1574.605 |
9 | San Jose Sharks | 1571.384 |
10 | New York Rangers | 1566.841 |
11 | Los Angeles Kings | 1550.622 |
13 | Nashville Predators | 1538.125 |
14 | New York Islanders | 1537.453 |
15 | Boston Bruins | 1531.571 |
16 | Florida Panthers | 1523.907 |
18 | Colorado Avalanche | 1517.049 |
19 | Minnesota Wild | 1513.761 |
20 | Philadelphia Flyers | 1513.489 |
21 | Montreal Canadiens | 1512.463 |
22 | Detroit Red Wings | 1509.428 |
23 | Ottawa Senators | 1507.323 |
24 | Columbus Blue Jackets | 1499.695 |
28 | Calgary Flames | 1491.623 |
29 | Winnipeg Jets | 1490.907 |
33 | New Jersey Devils | 1477.679 |
34 | Vancouver Canucks | 1475.976 |
36 | Carolina Hurricanes | 1458.905 |
40 | Arizona Coyotes | 1444.929 |
41 | Buffalo Sabres | 1434.475 |
43 | Toronto Maple Leafs | 1423.959 |
44 | Edmonton Oilers | 1422.394 |
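For reference, the basic Elo update behind ratings like these can be sketched in a few lines. This is the textbook formula, with a logistic expected score on a 400-point scale, and not a copy of the PlayerRatings internals; the function names are hypothetical.

```r
# Expected score of team A against team B (standard Elo logistic curve).
elo_expected <- function(rating_a, rating_b) {
  1 / (1 + 10^((rating_b - rating_a) / 400))
}

# One-game update. `score` is 1 for a win, 0.5 for a draw, 0 for a loss,
# all from team A's point of view.
elo_update <- function(rating_a, rating_b, score, k = 8) {
  expected <- elo_expected(rating_a, rating_b)
  change <- k * (score - expected)
  # Whatever A gains, B loses, so the sum of the two ratings is unchanged.
  c(a = rating_a + change, b = rating_b - change)
}

# Two teams at the initial rating: the winner gains k / 2 = 4 points.
elo_update(1500, 1500, score = 1)
# a = 1504, b = 1496
```

With k set to 8, no single game can move a team by more than 8 points, which is why the ratings above sit fairly close to 1500 even after a long history.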
Similarly, we can get the Glicko and Stephenson ratings:
nhlglicko<-glicko(nhl_results, init=c(1500, 300), kfac=8, history=TRUE)
kable(nhlglicko$ratings[nhlglicko$ratings$Lag < 100,c(1,2)], caption="NHL Glicko Ratings")
Player | Rating |
---|---|
Pittsburgh Penguins | 2056.885 |
San Jose Sharks | 1961.899 |
Tampa Bay Lightning | 1961.123 |
Washington Capitals | 1919.588 |
Dallas Stars | 1908.682 |
St. Louis Blues | 1906.094 |
Nashville Predators | 1879.422 |
Philadelphia Flyers | 1875.929 |
Anaheim Ducks | 1869.937 |
Chicago Blackhawks | 1855.996 |
New York Rangers | 1852.835 |
New York Islanders | 1842.990 |
Florida Panthers | 1841.062 |
Buffalo Sabres | 1819.547 |
Ottawa Senators | 1818.789 |
Columbus Blue Jackets | 1814.828 |
Winnipeg Jets | 1799.448 |
Los Angeles Kings | 1797.593 |
Boston Bruins | 1794.260 |
Montreal Canadiens | 1791.128 |
Detroit Red Wings | 1778.996 |
Minnesota Wild | 1776.143 |
New Jersey Devils | 1774.495 |
Calgary Flames | 1764.142 |
Colorado Avalanche | 1758.090 |
Arizona Coyotes | 1745.809 |
Carolina Hurricanes | 1739.298 |
Vancouver Canucks | 1722.548 |
Toronto Maple Leafs | 1711.570 |
Edmonton Oilers | 1710.663 |
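The extra ingredient in Glicko is the rating deviation (RD), which measures how uncertain a rating is. A rough sketch of how it enters the expected score, following Glickman’s published formulas, is below. The function names are hypothetical and this is not the PlayerRatings source.

```r
# Glicko scaling constant.
q <- log(10) / 400

# Attenuation factor: close to 1 for a well-known opponent,
# shrinking as the opponent's deviation grows.
glicko_g <- function(rd) {
  1 / sqrt(1 + 3 * q^2 * rd^2 / pi^2)
}

# Expected score of a team rated `r` against an opponent rated `r_opp`
# whose rating deviation is `rd_opp`.
glicko_expected <- function(r, r_opp, rd_opp) {
  1 / (1 + 10^(-glicko_g(rd_opp) * (r - r_opp) / 400))
}

# The same 100-point gap counts for less when the opponent's rating is uncertain.
glicko_expected(1500, 1400, rd_opp = 30)   # about 0.64
glicko_expected(1500, 1400, rd_opp = 300)  # about 0.60
```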
nhlsteph<-steph(nhl_results, init=c(1500, 300), kfac=8, history=TRUE)
kable(nhlsteph$ratings[nhlsteph$ratings$Lag < 100,c(1,2)], caption="NHL Stephenson Ratings")
Rank | Player | Rating |
---|---|---|
1 | Pittsburgh Penguins | 1672.636 |
2 | San Jose Sharks | 1605.519 |
3 | Tampa Bay Lightning | 1604.038 |
4 | Washington Capitals | 1571.463 |
6 | Dallas Stars | 1561.292 |
8 | St. Louis Blues | 1556.631 |
10 | Nashville Predators | 1543.061 |
11 | Philadelphia Flyers | 1541.821 |
13 | Anaheim Ducks | 1529.501 |
14 | Chicago Blackhawks | 1526.866 |
15 | New York Rangers | 1525.087 |
17 | New York Islanders | 1515.128 |
18 | Florida Panthers | 1514.468 |
19 | Buffalo Sabres | 1509.621 |
21 | Ottawa Senators | 1505.674 |
22 | Columbus Blue Jackets | 1502.818 |
23 | Winnipeg Jets | 1494.553 |
24 | Montreal Canadiens | 1487.634 |
26 | Boston Bruins | 1479.551 |
27 | Los Angeles Kings | 1477.632 |
28 | Minnesota Wild | 1471.183 |
29 | Detroit Red Wings | 1467.098 |
30 | New Jersey Devils | 1466.709 |
31 | Calgary Flames | 1466.100 |
33 | Colorado Avalanche | 1451.003 |
35 | Arizona Coyotes | 1448.379 |
36 | Carolina Hurricanes | 1441.683 |
37 | Vancouver Canucks | 1435.352 |
39 | Toronto Maple Leafs | 1426.483 |
40 | Edmonton Oilers | 1424.904 |
Note that we’ve dropped the historic NHL teams from these tables, which may account for some of the deviation of the average from 1500. For a better idea of the current state of teams, we’ll look only at data from 1991-92 onwards, when the San Jose Sharks joined the league.
nhl_recent<-nhl_all[nhl_all$Date>as.Date("1991-07-01"),]
dates<-unique(sort(nhl_recent$Date))
nhl_recent$DateFactor<-match(nhl_recent$Date, dates)
nhl_results<-nhl_recent[,c(13,2,4,12)]
nhl_results$Visitor<-as.character(nhl_results$Visitor)
nhl_results$Home<-as.character(nhl_results$Home)
recent_elo<-elo(nhl_results, init=1500, kfac=8, history=TRUE)
recent_glicko<-glicko(nhl_results, init=c(1500, 300), kfac=8, history=TRUE)
recent_steph<-steph(nhl_results, init=c(1500, 300), kfac=8, history=TRUE)
These result in the following:
##
## Elo Ratings For 30 Players Playing 29071 Games
##
## Player Rating Games Win Draw Loss Lag
## 1 Pittsburgh Penguins 1585 2156 1116 115 925 0
## 2 Washington Capitals 1573 2061 1007 122 932 21
## 3 St. Louis Blues 1573 2067 1019 139 909 7
## 4 Anaheim Ducks 1565 1879 898 107 874 34
## 5 Chicago Blackhawks 1561 2091 1001 148 942 35
## 6 Tampa Bay Lightning 1559 1938 810 112 1016 6
## 7 Dallas Stars 1554 2069 1059 141 869 20
## 8 San Jose Sharks 1550 2107 984 121 1002 0
## 9 New York Rangers 1546 2091 1021 118 952 37
## 10 Los Angeles Kings 1530 2041 916 148 977 38
## 11 Nashville Predators 1517 1430 673 60 697 19
## 12 New York Islanders 1516 1970 800 124 1046 23
## 13 Boston Bruins 1511 2077 1024 140 913 49
## 14 Florida Panthers 1503 1782 720 142 920 36
## 15 Colorado Avalanche 1496 2090 1058 136 896 49
## 16 Minnesota Wild 1493 1259 582 55 622 36
## 17 Philadelphia Flyers 1493 2106 1045 153 908 36
## 18 Montreal Canadiens 1491 2066 983 131 952 49
## 19 Detroit Red Wings 1488 2198 1266 130 802 39
## 20 Ottawa Senators 1486 1954 883 115 956 49
## 21 Columbus Blue Jackets 1479 1206 487 33 686 49
## 22 Calgary Flames 1471 1989 890 147 952 49
## 23 Winnipeg Jets 1470 1286 518 45 723 49
## 24 New Jersey Devils 1457 2121 1117 137 867 49
## 25 Vancouver Canucks 1455 2073 1008 140 925 49
## 26 Carolina Hurricanes 1438 1987 843 139 1005 49
## 27 Arizona Coyotes 1424 1978 852 138 988 49
## 28 Buffalo Sabres 1413 2042 945 135 962 49
## 29 Toronto Maple Leafs 1403 2041 936 119 986 49
## 30 Edmonton Oilers 1401 1987 796 138 1053 49
##
## Glicko Ratings For 30 Players Playing 29071 Games
##
## Player Rating Deviation Games Win Draw Loss Lag
## 1 Pittsburgh Penguins 1890 83.81 2156 1116 115 925 0
## 2 San Jose Sharks 1795 83.56 2107 984 121 1002 0
## 3 Tampa Bay Lightning 1794 90.86 1938 810 112 1016 6
## 4 Washington Capitals 1753 91.09 2061 1007 122 932 21
## 5 Dallas Stars 1742 89.70 2069 1059 141 869 20
## 6 St. Louis Blues 1739 88.36 2067 1019 139 909 7
## 7 Nashville Predators 1712 88.47 1430 673 60 697 19
## 8 Philadelphia Flyers 1709 87.11 2106 1045 153 908 36
## 9 Anaheim Ducks 1703 87.33 1879 898 107 874 34
## 10 Chicago Blackhawks 1689 89.57 2091 1001 148 942 35
## 11 New York Rangers 1686 90.64 2091 1021 118 952 37
## 12 New York Islanders 1676 89.28 1970 800 124 1046 23
## 13 Florida Panthers 1674 88.13 1782 720 142 920 36
## 14 Buffalo Sabres 1652 89.58 2042 945 135 962 49
## 15 Ottawa Senators 1652 90.95 1954 883 115 956 49
## 16 Columbus Blue Jackets 1648 87.88 1206 487 33 686 49
## 17 Winnipeg Jets 1632 89.44 1286 518 45 723 49
## 18 Los Angeles Kings 1631 88.26 2041 916 148 977 38
## 19 Boston Bruins 1627 89.03 2077 1024 140 913 49
## 20 Montreal Canadiens 1624 90.56 2066 983 131 952 49
## 21 Detroit Red Wings 1612 87.54 2198 1266 130 802 39
## 22 Minnesota Wild 1609 91.32 1259 582 55 622 36
## 23 New Jersey Devils 1607 89.13 2121 1117 137 867 49
## 24 Calgary Flames 1597 88.55 1989 890 147 952 49
## 25 Colorado Avalanche 1591 90.13 2090 1058 136 896 49
## 26 Arizona Coyotes 1579 88.74 1978 852 138 988 49
## 27 Carolina Hurricanes 1572 89.24 1987 843 139 1005 49
## 28 Vancouver Canucks 1555 89.06 2073 1008 140 925 49
## 29 Toronto Maple Leafs 1545 87.70 2041 936 119 986 49
## 30 Edmonton Oilers 1544 93.73 1987 796 138 1053 49
##
## Stephenson Ratings For 30 Players Playing 29071 Games
##
## Player Rating Deviation Games Win Draw Loss Lag
## 1 Pittsburgh Penguins 1686 76.78 2156 1116 115 925 0
## 2 San Jose Sharks 1619 76.62 2107 984 121 1002 0
## 3 Tampa Bay Lightning 1617 80.58 1938 810 112 1016 6
## 4 Washington Capitals 1585 80.35 2061 1007 122 932 21
## 5 Dallas Stars 1574 79.89 2069 1059 141 869 20
## 6 St. Louis Blues 1570 79.17 2067 1019 139 909 7
## 7 Nashville Predators 1556 79.11 1430 673 60 697 19
## 8 Philadelphia Flyers 1555 78.30 2106 1045 153 908 36
## 9 Anaheim Ducks 1543 78.49 1879 898 107 874 34
## 10 Chicago Blackhawks 1540 79.65 2091 1001 148 942 35
## 11 New York Rangers 1538 80.09 2091 1021 118 952 37
## 12 New York Islanders 1528 79.44 1970 800 124 1046 23
## 13 Florida Panthers 1528 78.92 1782 720 142 920 36
## 14 Buffalo Sabres 1523 79.71 2042 945 135 962 49
## 15 Ottawa Senators 1519 80.34 1954 883 115 956 49
## 16 Columbus Blue Jackets 1516 78.89 1206 487 33 686 49
## 17 Winnipeg Jets 1508 79.38 1286 518 45 723 49
## 18 Montreal Canadiens 1501 80.10 2066 983 131 952 49
## 19 Boston Bruins 1493 79.36 2077 1024 140 913 49
## 20 Los Angeles Kings 1491 78.87 2041 916 148 977 38
## 21 Minnesota Wild 1484 80.43 1259 582 55 622 36
## 22 Detroit Red Wings 1480 78.58 2198 1266 130 802 39
## 23 New Jersey Devils 1480 79.48 2121 1117 137 867 49
## 24 Calgary Flames 1479 79.04 1989 890 147 952 49
## 25 Colorado Avalanche 1464 80.03 2090 1058 136 896 49
## 26 Arizona Coyotes 1462 79.09 1978 852 138 988 49
## 27 Carolina Hurricanes 1455 79.53 1987 843 139 1005 49
## 28 Vancouver Canucks 1449 79.31 2073 1008 140 925 49
## 29 Toronto Maple Leafs 1440 78.56 2041 936 119 986 49
## 30 Edmonton Oilers 1438 81.77 1987 796 138 1053 49
It looks like both the Glicko and Stephenson ratings are ‘high’ compared to the primary Elo ratings. Checking the averages confirms this, with Glicko averaging 1661.28 and Stephenson averaging 1520.64. This is not ideal: Elo should, by nature, remain centred around the starting value. Additionally, there’s no normalization at the end of every season, which is a common feature in sports team predictions (as opposed to the individual rankings in chess that Elo was developed for). Next post we’ll build our own Elo ranking that corrects for these and more.
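The claim that plain Elo stays centred on its starting value is easy to check with a small simulation. The sketch below uses made-up teams and coin-flip results, with a textbook Elo update rather than the PlayerRatings code. Every game moves the same number of points from one team to the other, so the league average cannot drift.

```r
set.seed(42)

n_teams <- 6
ratings <- rep(1500, n_teams)
k <- 8

for (game in 1:1000) {
  # Pick two different teams at random.
  pair <- sample(n_teams, 2)
  a <- pair[1]
  b <- pair[2]

  # Coin-flip result from team a's point of view (1 = win, 0 = loss).
  score <- sample(c(0, 1), 1)

  expected <- 1 / (1 + 10^((ratings[b] - ratings[a]) / 400))
  change <- k * (score - expected)

  # Zero-sum transfer: a's gain is b's loss.
  ratings[a] <- ratings[a] + change
  ratings[b] <- ratings[b] - change
}

# Individual ratings spread out, but the mean stays at 1500
# (up to floating-point error).
mean(ratings)
```

Glicko-style updates are not zero-sum in this way, since each team’s change is scaled by its own deviation, which is one plausible reason for the drift in the averages above.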