One of the most fascinating discussions sports fans can have is comparing teams or players from different eras. This is a really tough comparison to make, and obviously we will never know the ground truth, since the teams/players being debated will never play against each other! So we can turn to the numbers for a more objective answer. However, this is not straightforward either: the competition the two teams faced was most probably different, the style of play was most probably different, and key rules might have been different too, among many other factors. Still, with appropriate processing there might be a way to look at the numbers a little more holistically. So in this post I will attempt to showcase how one can use z-scores to compare performances across eras. To do so I will use a debate I find myself in very often: which team would I pick in a 7-game series, the 1990s Bulls or the contemporary Warriors? (For my subjective self, a tougher choice is between the Showtime Lakers and the 1990s Bulls.)
So let’s start with some basics on the z-score. The z-score of a data point tells us how many standard deviations above or below the mean of the whole dataset this observation is, and is calculated by:
$$z = \frac{x - \mu}{\sigma}$$
where $x$ is the observation, $\mu$ is the mean of the dataset and $\sigma$ is the corresponding standard deviation. This is a way to standardize the data: if one takes the z-scores of all the observations in a dataset, they will have a mean of 0 and a standard deviation of 1. Standardizing observations allows us to compare data that are on different scales, and can thus be used to compare players/teams from different eras. To understand this better before delving into the Bulls-Warriors debate, let’s see a nice example from the Mathletics book. Let’s say that someone asks you to compare Rogers Hornsby to George Brett based on their batting averages. Hornsby had a .424 batting average, while Brett had a .390. A naive comparison would conclude that Hornsby was more impressive, since his batting average was higher. However, in the 1920s, when Hornsby played, the league batting average was .299 with a standard deviation of .0334, while in the 1980s, when Brett played, the league batting average was .274 with a standard deviation of .0286. If we calculate the z-scores for the two players we have:
$$z_{\text{Hornsby}} = \frac{.424 - .299}{.0334} \approx 3.74, \qquad z_{\text{Brett}} = \frac{.390 - .274}{.0286} \approx 4.06$$
Based on the z-scores, Brett was more than 4 standard deviations better than an average batter in his era (of course Hornsby was still extraordinarily better than his average competition, but not as extraordinary as Brett). In other words, standardization puts the data into context relative to the era’s competition and is a nice tool to keep in mind for such comparisons. Let’s now see how we can use z-scores to compare the Bulls and the Warriors.
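For readers who prefer code, here is a minimal Python sketch of the same calculation. The Hornsby and Brett numbers are the ones quoted above; the function name `z_score` and the small `sample` list are my own illustrative choices (the sample is made up, just to check that standardized data end up with mean 0 and standard deviation 1).

```python
from statistics import mean, pstdev


def z_score(value, mu, sigma):
    """Standard deviations that `value` lies above (+) or below (-) `mu`."""
    return (value - mu) / sigma


# Batting averages and era statistics quoted in the text.
hornsby_z = z_score(0.424, mu=0.299, sigma=0.0334)
brett_z = z_score(0.390, mu=0.274, sigma=0.0286)
print(f"Hornsby: {hornsby_z:.2f}, Brett: {brett_z:.2f}")
# prints "Hornsby: 3.74, Brett: 4.06"

# Hypothetical sample (not real data): after standardization the values
# should have mean 0 and (population) standard deviation 1.
sample = [0.250, 0.265, 0.280, 0.301, 0.330]
mu, sigma = mean(sample), pstdev(sample)
standardized = [z_score(x, mu, sigma) for x in sample]
print(mean(standardized), pstdev(standardized))
```

Note that the sketch uses the population standard deviation (`pstdev`); with the sample standard deviation the standardized values would not have a standard deviation of exactly 1.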
Obviously there is not one statistic that can answer this kind of question. Furthermore, this is clearly not a deterministic question, since no team has a 100% win probability against any opponent (unless the game is over), let alone when you compare two all-time greats. Hence, in principle what we would like to know is the win probability of a Bulls (90s) – Warriors (10s) matchup. Earlier in this blog we presented a simple, yet accurate, model for calculating pre-game win probabilities for NBA matchups, namely, the Basketball Prediction Matchup (BPM). While you can see the details in the corresponding post, in brief, BPM relies on a Bradley-Terry model on Oliver’s four factors, namely, effective field goal percentage, turnover rate, offensive rebound rate and the ratio of free throws to field goal attempts. So what one could do is take each team’s four factors, throw them into BPM and obtain a win probability.
Not so fast though! The game has changed dramatically in the two decades that separate the two teams. For example, in the 90s the NBA was a big man’s league, while now it is a small man’s league, not to mention the explosion of 3-point shooting. This latter point in particular has a strong effect on one of the four factors, the effective field goal percentage! And while we are at it, the 90s Bulls played two seasons with a shortened 3-point line; it would not be fair to make direct comparisons across eras with differences like these. To be a little smarter, we will calculate the z-scores of the 90s Bulls’ four factors and then use these z-scores to project them to the contemporary NBA. Now, you might be wondering, is this accurate? Most probably not completely; after all, every model is wrong, but since some are useful I think this is a good start. We had two options here: (i) project the Warriors’ factors to the Bulls’ era, or (ii) project the Bulls’ factors to the contemporary NBA. I chose the second, since we are conditioned to think in contemporary terms but, most importantly, because BPM was trained using data from the past couple of seasons.
Using data from basketball-reference.com, the following table shows Chicago’s z-scores for the four factors during their 6 championship years (note that a negative z-score for the turnover rate is good):
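We will begin by considering all possible team matchups (e.g., the 1990-91 Bulls against the 2017-18 Warriors, etc.). Hence, in order to project the Bulls’ performance from season $i$ to season $j$ we will use the corresponding z-score from season $i$ and the league average and standard deviation of season $j$. For example, the projected eFG for the Bulls of season $i$ in the 2014-15 NBA season would be:
$$\widehat{eFG}_{i \to 2014\text{-}15} = \mu^{eFG}_{2014\text{-}15} + z^{eFG}_{i} \cdot \sigma^{eFG}_{2014\text{-}15}$$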
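The projection step is just the z-score formula run in reverse, which can be sketched in a few lines of Python. Keep in mind that every number below is a made-up placeholder (not basketball-reference.com data), and the helper names `z_score` and `project` are mine, chosen only for illustration.

```python
def z_score(value, mu, sigma):
    """Standard deviations that `value` lies above (+) or below (-) `mu`."""
    return (value - mu) / sigma


def project(z, target_mu, target_sigma):
    """Map a z-score onto the scale of a target season (inverse z-score)."""
    return target_mu + z * target_sigma


# Hypothetical placeholder values, NOT the real league statistics.
bulls_efg = 0.510                         # Bulls eFG% in their own season
source_mu, source_sigma = 0.487, 0.015    # league eFG% mean/std, Bulls' season
target_mu, target_sigma = 0.496, 0.020    # league eFG% mean/std, target season

z = z_score(bulls_efg, source_mu, source_sigma)
projected = project(z, target_mu, target_sigma)
print(f"z = {z:.2f}, projected eFG% = {projected:.3f}")
```

By construction, the projected value has the same z-score in the target season as the original value had in its own season, which is exactly the sense in which the team is "equally far ahead of its competition" in both eras.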
Using these projections we can now use BPM to obtain win probabilities. The following show the win probabilities for the 1990s Chicago Bulls against the contemporary Warriors for home and away games respectively:
As we can see, both from the z-scores and the win probabilities, the 1997-98 Bulls might be the least competitive against the Warriors. Potentially this is due to their bad start to that season, playing without Scottie Pippen for the first half and hovering practically around .500. Overall, however, the Bulls would be the favorite in 83% of the matchups; their average win probability is 60% for a home game and 55% for an away game. This is not a terribly telling stat, but the Bulls seem to have a slight (?) edge over the Warriors. We simulated 20,000 7-game series between the two teams using these win probabilities (10,000 with the Bulls having home-court advantage and 10,000 with the Warriors having it) and the Bulls won 66% of them. On average, the Bulls won their series in 6 games, while the Warriors won theirs in 6.5 games. Following is the distribution of the series length:
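If you want to play with this kind of simulation yourself, here is a rough Python sketch. It is a simplification under stated assumptions, not the exact code behind the numbers above: it uses only the average Bulls win probabilities quoted in the text (60% at home, 55% away) rather than the per-matchup BPM probabilities, and it assumes the 2-2-1-1-1 home/away format. All function and variable names are hypothetical.

```python
import random
from collections import Counter

# Assumed 2-2-1-1-1 format: the team with home court hosts games 1, 2, 5, 7.
HOME_COURT_GAMES = {1, 2, 5, 7}


def simulate_series(p_home, p_away, bulls_home_court, rng):
    """Play one best-of-7 series; return (bulls_won, games_played)."""
    bulls, warriors, game = 0, 0, 0
    while bulls < 4 and warriors < 4:
        game += 1
        # The Bulls are at home if they own home court and this is a
        # home-court game, or if they do not and it is not.
        bulls_at_home = (game in HOME_COURT_GAMES) == bulls_home_court
        p_bulls = p_home if bulls_at_home else p_away
        if rng.random() < p_bulls:
            bulls += 1
        else:
            warriors += 1
    return bulls == 4, game


def simulate_many(n_per_side, p_home, p_away, seed=0):
    """Simulate n_per_side series for each home-court assignment."""
    rng = random.Random(seed)
    return [
        simulate_series(p_home, p_away, bulls_home_court, rng)
        for bulls_home_court in (True, False)
        for _ in range(n_per_side)
    ]


# Average Bulls win probabilities quoted in the text (home 60%, away 55%).
results = simulate_many(10_000, p_home=0.60, p_away=0.55)
bulls_share = sum(won for won, _ in results) / len(results)
length_counts = Counter(games for _, games in results)

print(f"Bulls win {bulls_share:.1%} of simulated series")
for games in sorted(length_counts):
    print(f"{games} games: {length_counts[games] / len(results):.1%}")
```

Since this sketch collapses all season pairings into two average probabilities, its output should be read as a ballpark figure rather than a reproduction of the results reported above.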
Obviously the above does not settle the debate by any stretch of the imagination. As mentioned earlier, using z-scores is just a rough way of making such comparisons, and unfortunately (or fortunately) we will never know how a game between the two teams would turn out. However, I hope it showcases how one can start thinking about comparing players/teams from different eras! Plus it is a fun way to understand (and teach) standardization.