Metrics typically reported in sports media to help understand the performance of lineups include +/- and efficiency ratings. Overall they do a pretty good job of describing the performance of a lineup, except when they do not. For instance, during garbage time, points scored/allowed can directly overvalue/undervalue the performance of a lineup. Of course, one can eliminate those stints, but then the question becomes: which stints are we going to use for the evaluation? To avoid this problem we borrow ideas that have been used extensively (at least in the analytics community) in the NFL. Specifically, we can track how the win probability of a team changes while a lineup is on the court. More specifically, if $w_s(l)$ is the win probability of lineup $l$'s team at the beginning of its stint, and $w_e(l)$ is the win probability at the end of the stint, then the win probability added by lineup $l$ is: $\pi(l) = w_e(l) - w_s(l)$. Before going any further we certainly need to build a win probability model. For our case we built a simple logistic regression model where the independent variables used are:

• Current score differential
• Time remaining
• Home (away) timeouts (long/short) taken
• Possession team
• Team strength differential (many options here – regression-based ratings, Vegas point spreads, win-loss percentage, etc. – we used the last of these).

We have also included the following interaction terms:

• Time remaining and possession team
• Time remaining and score differential
• Time remaining and teams strength differential
• Score differential and possession team

We used play-by-play data from the 2014-15 and 2015-16 NBA seasons to build this model, and the following is the reliability curve we obtained. As we can see, the reliability curve is practically the y=x line, which means that our in-game win probabilities are well-calibrated.
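As a concrete illustration, here is a minimal sketch of how such a logistic win probability model could be scored, using the features and interaction terms listed above. The coefficient values are hypothetical placeholders for illustration only; in practice they would be estimated by fitting a logistic regression on play-by-play data.

```python
import math

# Hypothetical coefficients for illustration; in practice they come from
# fitting a logistic regression on play-by-play data.
COEFS = {
    "intercept": 0.0,
    "score_diff": 0.08,          # current score differential
    "time_remaining": -0.0001,   # seconds remaining
    "home_timeouts": 0.01,
    "away_timeouts": -0.01,
    "possession": 0.15,          # 1 if the team has the ball, else 0
    "strength_diff": 1.5,        # e.g., win-loss% differential
    # interaction terms
    "time_x_possession": -0.0002,
    "time_x_score": -0.00005,
    "time_x_strength": 0.0005,
    "score_x_possession": 0.005,
}

def win_probability(score_diff, time_remaining, home_timeouts, away_timeouts,
                    possession, strength_diff, coefs=COEFS):
    """Logistic win probability: sigmoid of a linear combination of the
    features and the four interaction terms listed above."""
    z = (coefs["intercept"]
         + coefs["score_diff"] * score_diff
         + coefs["time_remaining"] * time_remaining
         + coefs["home_timeouts"] * home_timeouts
         + coefs["away_timeouts"] * away_timeouts
         + coefs["possession"] * possession
         + coefs["strength_diff"] * strength_diff
         + coefs["time_x_possession"] * time_remaining * possession
         + coefs["time_x_score"] * time_remaining * score_diff
         + coefs["time_x_strength"] * time_remaining * strength_diff
         + coefs["score_x_possession"] * score_diff * possession)
    return 1.0 / (1.0 + math.exp(-z))
```

With coefficients fitted on real data, calling this function at the start and end of every stint gives the $w_s(l)$ and $w_e(l)$ values needed for the win probability added calculation.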

We can then use this win probability model to calculate the win probability added (WPA) for a lineup during every stint it played and then calculate its win probability added efficiency (i.e., win probability added per 100 possessions) as: $\pi_{l} = 100 \cdot \dfrac{\sum_{\sigma \in \mathcal{S}_l} \pi_l(\sigma)}{\sum_{\sigma \in \mathcal{S}_l} p(\sigma)}$, where $\mathcal{S}_l$ is the set of stints played by lineup $l$, $\pi_l(\sigma)$ is the win probability added during stint $\sigma$, and $p(\sigma)$ is the number of possessions in stint $\sigma$. After obtaining these raw win probability efficiencies, we can use the Bayesian average to adjust the ratings in a way that accounts for the different number of observations (i.e., possessions) we have for each lineup: $\tilde{\pi}_{l} = \dfrac{\overline{\pi}\cdot\overline{N} + \pi_{l}\cdot p_l}{\overline{N}+p_l}$, where $\overline{\pi}$ is the (league) average total win probability added and $\overline{N}$ is the average number of observations over all lineups. We have essentially assumed that our prior belief for lineups we have never seen is that of a league-average lineup. As we collect more observations for lineup $l$, they will eventually outweigh our prior belief, and $\tilde{\pi}_l \rightarrow \pi_l$ when $p_l \gg \overline{N}$.
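The two formulas above translate directly into code. The sketch below is a minimal illustration: the stint tuples and the league-average values passed in are hypothetical inputs, not results from the actual data.

```python
def wpa_efficiency(stints):
    """Win probability added per 100 possessions for one lineup.

    `stints` is a list of (w_start, w_end, possessions) tuples, one per
    stint: the win probabilities at the start and end of the stint and
    the number of possessions the stint lasted.
    """
    total_wpa = sum(w_end - w_start for w_start, w_end, _ in stints)
    total_poss = sum(p for _, _, p in stints)
    return 100.0 * total_wpa / total_poss

def bayesian_adjust(pi_l, p_l, pi_bar, n_bar):
    """Shrink a lineup's raw rating pi_l (observed over p_l possessions)
    toward the league-average rating pi_bar, weighted by the average
    sample size n_bar, per the Bayesian average formula above."""
    return (pi_bar * n_bar + pi_l * p_l) / (n_bar + p_l)
```

Note the limiting behavior: a lineup with zero possessions gets exactly the league-average rating, while for $p_l \gg \overline{N}$ the adjusted rating converges to the raw one.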

But what is the reason for doing all of this? Does win probability added indeed provide us with a different view of the lineups? To examine this, we trained a win probability model using the play-by-play data from the 2014-15 NBA season and calculated the WPA efficiency and net efficiency rankings for every lineup in the 2015-16 season. We then calculated the rank correlation between the two rankings obtained from the different metrics. The correlation coefficient is 0.71, and there are two observations to be made:
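A rank correlation between two lineup rankings can be computed, for example, with Spearman's rank correlation coefficient (Pearson correlation on the ranks). The sketch below uses plain ordinal ranks and ignores tie handling for simplicity.

```python
def ranks(values):
    """Ordinal ranks (1-based) of the values; ties are broken by order,
    which is a simplification of the usual average-rank convention."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Feeding in the WPA efficiency values and the net efficiency values for the same set of lineups yields a single coefficient in $[-1, 1]$; identical orderings give 1 and fully reversed orderings give -1.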

1. The two metrics agree to a large extent with regard to which lineups add value to a team. This is important since the net efficiency rating is generally considered a good lineup evaluation metric.

2. There are measurable differences between the two metrics, which allow us to get different insights into the performance of the lineups. You might be wondering, “If this WPA efficiency for 5-man lineups is so great, then why isn’t it used more?” The major drawback of this approach is that it depends on the in-game win probability model used, and therefore different models might provide different ratings. In contrast, the efficiency ratings are purely based on points scored and allowed, which are unambiguous. Of course, this is not something that hasn’t happened before in sports analytics. For example (and full disclaimer: I am not a big baseball fan and hence not very well versed in the analytics literature), in baseball there are many different versions of wins above replacement! What is important is the high-level idea; how one develops and implements it further is secondary – at least at this stage – but it is certainly critical for the success of the “metric”.