Two-point conversion: My two data cents

The PAT (point after touchdown) kick has become as ceremonial a play as the ceremonial first pitch in baseball! It is not a stretch to argue that people new to American football think that a touchdown is worth 7 points instead of 6. For these newcomers, it might also come as a surprise when they first see the rare scenario in which a team tries for a two-point conversion after a touchdown. Once they realize what is going on, and if they are analytical thinkers, they will be even more confused by the fact that teams do not attempt this play more often!

For an analytical thinker, the decision between the PAT kick and the two-point conversion comes down to which provides the highest expected number of points scored (or the highest win probability; I will come back to this later). In order to calculate this, one needs two things: the success rate s_k of the PAT kick and the success rate s_2 of the two-point conversion play. Then if 2\cdot s_2 > 1\cdot s_k, the two-point conversion is preferable! So everything comes down to calculating the success rates s_2 and s_k. Here is where data come in handy! The NFL API provides information for every drive, so one can obtain all the touchdown plays and the PAT plays (attempts, successes, failures), which is all we need to compute s_2 and s_k.
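The decision rule above can be written as a minimal sketch. The function names and the example rates below are hypothetical and only illustrate the comparison 2\cdot s_2 > 1\cdot s_k.

```python
def expected_points(success_rate: float, points: int) -> float:
    """Expected points of a try worth `points` that succeeds with `success_rate`."""
    return points * success_rate


def prefer_two_point(s_k: float, s_2: float) -> bool:
    """True when 2 * s_2 > 1 * s_k, i.e. the two-point try has more expected points."""
    return expected_points(s_2, 2) > expected_points(s_k, 1)


# Hypothetical rates, chosen only to show both outcomes of the rule.
print(prefer_two_point(s_k=0.95, s_2=0.50))  # two-point try is worth more here
print(prefer_two_point(s_k=0.95, s_2=0.45))  # kick is worth more here
```

Put differently, the two-point try wins on expected points whenever s_2 exceeds half of s_k.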

I collected information for the past 7 NFL seasons, during which 9021 total touchdowns were scored. Out of them, a whopping 8561 were followed by a PAT kick, with 8425 of them being successful (i.e., s_k = 98.4\%). In contrast, out of the 460 two-point conversion attempts, 235 were successful, leading to s_2 = 51\%. Therefore the expected number of points from a kick is about 0.98, while the expected number of points from the two-point conversion is slightly larger, at 1.02. Is it statistically larger though? Calculating the 95% confidence intervals for the expected points, we get [0.97, 0.983] for the PAT kick and [0.95, 1.11] for the two-point conversion play. Since the two intervals overlap, we cannot say with much confidence that the two play calls differ in expected points.

Nevertheless, we also need to recognize that the effect to be detected (if any) is small, and that the sample size at hand (in particular the small number of two-point conversion attempts) limits the statistical power of the comparison. In brief, statistical power is the probability that our test detects an effect when one exists. Computing the statistical power of our test (using the pwr package in R), we obtain a power of less than 0.6 (even at a significance level of 0.1). A typical threshold for a test to be deemed “powerful” is 0.8, and therefore the overlapping confidence intervals might be just an artifact of an underpowered test!
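For readers who want to experiment with these quantities, here is a rough sketch in Python. It assumes a normal-approximation (Wald) interval for each success rate and a two-sided z-test on the difference in expected points, with the observed difference treated as the true effect. These are my assumptions, not necessarily the method behind the numbers above (the power figure above came from the pwr package in R), so the outputs need not match the quoted intervals exactly. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for quantiles and tail areas


def expected_points_ci(rate, attempts, points, confidence=0.95):
    """Wald interval for the success rate, scaled to expected points."""
    z = STD_NORMAL.inv_cdf(0.5 + confidence / 2)
    half_width = z * sqrt(rate * (1 - rate) / attempts)
    return points * (rate - half_width), points * (rate + half_width)


def approx_power(s_k, n_k, s_2, n_2, alpha=0.1):
    """Approximate power of a two-sided z-test on 2*s_2 - s_k,
    assuming the observed difference is the true effect."""
    diff = 2 * s_2 - s_k
    # The variance of a k-point try is k^2 * p * (1 - p) per attempt.
    se = sqrt(4 * s_2 * (1 - s_2) / n_2 + s_k * (1 - s_k) / n_k)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(diff) / se
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


# Rates and attempt counts as quoted in the text.
s_k, n_k = 0.984, 8561
s_2, n_2 = 235 / 460, 460

print(expected_points_ci(s_k, n_k, points=1))
print(expected_points_ci(s_2, n_2, points=2))
print(approx_power(s_k, n_k, s_2, n_2, alpha=0.1))
```

Almost all of the uncertainty comes from the 460 two-point attempts: the standard error of the difference is dominated by the two-point term, which is why more attempts are what the comparison needs most.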

However, here is where things get interesting. Last year the NFL introduced a new rule for the PAT kick in an effort to make the extra point kick a little more interesting (if this is even possible). In particular, the extra point kick is now taken from the 15-yard line (instead of the 2-yard line). This had a significant impact on the success rate of the extra point kick. In the following figure we can see the success rate for the PAT kick during the last 7 NFL seasons.

[Figure: PAT kick success rate per season over the last 7 NFL seasons]


As we can see, the rule change led to a lower s_{k}, which in turn lowers the expected points from that play. In fact, the 95% confidence interval for the expected points from the kick is now [0.92, 0.954], which still slightly overlaps with the interval for the expected points from a two-point play, while at the 0.1 significance level (i.e., with 90% confidence intervals) the two intervals do not overlap.

Overall, the conclusion that we can draw from this analysis is that NFL coaches need to attempt more two-point conversions in order to allow us to perform statistical comparisons that are not underpowered and hence can lead us to more robust conclusions. Joking aside, it should be clear that the dominance of the kick as the PAT decision is not justified if we think of the decision as trying to maximize expected points. At the very least, since both options provide pretty much the same expected points per play, coaches could be indifferent between them, and we could expect half of the PATs to be kicks and half of them to be two-point conversion attempts. Of course, this is not what we observe. This does not mean that the decision coaches make is wrong (though sometimes it is); it just means that coaches prefer to take the conventional, less variable route (two-point conversions are associated with larger variance). Coaches might also try to maximize the risk-adjusted return, that is, the gain divided by the variance. What should they try to maximize though?
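To make the variance argument concrete, here is a small sketch that treats each try as a single success/failure trial, so that a try worth k points with success rate p has mean k\cdot p and variance k^2\cdot p\cdot(1-p). The function names are hypothetical, and the rates are the seven-season figures quoted earlier.

```python
def mean_and_variance(success_rate: float, points: int):
    """Mean and variance of the points from one try (a scaled Bernoulli trial)."""
    mean = points * success_rate
    variance = points ** 2 * success_rate * (1 - success_rate)
    return mean, variance


def risk_adjusted_return(success_rate: float, points: int) -> float:
    """Gain divided by variance, as described in the text.
    Undefined (division by zero) when success_rate is exactly 0 or 1."""
    mean, variance = mean_and_variance(success_rate, points)
    return mean / variance


for label, rate, points in [("kick", 0.984, 1), ("two-point", 0.51, 2)]:
    mean, variance = mean_and_variance(rate, points)
    print(label, mean, variance, risk_adjusted_return(rate, points))
```

With these rates the two means are close, but the variance of the two-point try is far larger, so a coach who optimizes gain per unit of variance would strongly favor the kick.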

Up to this point we have focused on the expected points from a PAT try, mainly because they are easy to compute (we basically only need the success rates). However, the true objective of a coach should be to maximize the win probability of his team, rather than the expected points (if the game had an infinite horizon these two might in fact be equivalent, but this is not the case in reality, especially towards the end of the game). This is obviously a more complex decision to make, since one needs to calculate: (i) the win probability before the PAT attempt w_{pre}, (ii) the win probability with one extra point on the board w_{pre+1} and (iii) the win probability with two extra points on the board w_{pre+2}. So a win probability model is needed, and there are many out there that can be used. Once we have this, the calculation is fairly straightforward. With s_{2} being the success rate of a two-point conversion, the expected win probability after the two-point attempt is s_{2}\cdot w_{pre+2} + (1-s_{2})\cdot w_{pre}. Similarly, with s_{k} being the success rate of the PAT kick, the expected win probability after the kick is s_{k}\cdot w_{pre+1} + (1-s_{k})\cdot w_{pre}. The coach can then choose whichever option gives the highest expected win probability. For example, let us assume that you just scored a TD with 5 minutes left to cut the lead to 8 points. Should you go for two or for one? The following is a chart that gives you the answer based on your PAT success rate and your two-point conversion success rate.

[Figure: recommended call (kick or two-point attempt) as a function of PAT kick success rate and two-point conversion success rate]

As we can see, in this case the league-average decision should be to go for it (in contrast to common belief and practice). Of course, one can treat the PAT kick success rate as a fixed value (95% for instance) and then calculate the break-even two-point conversion success rate that your team needs in order for going for it to be beneficial. In the case above this break-even rate is about 38%, which is below the league average. The break-even point is the point where the vertical line at the fixed PAT kick success rate intersects the “same” line.

One thing that I would like to note here from a practical perspective is that a (generalized) linear model might not be a good approach for this decision making. The problem (at least when using it in its default form) is that it does not understand the difference between being up 2 points, being up 3 points and being up 4 points. It considers an increase of one point to have the same effect regardless of the value of the score differential (it is a linear model, so the slope is constant!). So you need to make sure that the model can capture these non-linearities. Also, I find it useful to train the model not on every play in the dataset but only on plays at the beginning of a drive (i.e., after the kickoff). This way we have two fewer parameters to fit (down and yards-to-go), while at the same time we focus on exactly the situation that is present when the team has to make the decision. One caveat is that the starting field position of the opposing drive is not always their own touchback line, but one can account for that with a weighted average of starting field position based on the kickoff units that compete (this makes the model much more complicated for a possibly small gain in accuracy).
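The expected win probability comparison and the break-even rate described above can be sketched as follows. Setting the two expected win probabilities equal and solving for s_{2} gives a break-even rate of s_{k}\cdot(w_{pre+1} - w_{pre})/(w_{pre+2} - w_{pre}). The function names are hypothetical, and the three win probabilities in the example are made-up placeholders rather than outputs of any real win probability model; I picked them only so that the break-even lands near the 38% mentioned above.

```python
def expected_wp(success_rate: float, wp_success: float, wp_fail: float) -> float:
    """Expected win probability of a try: s * w_success + (1 - s) * w_fail."""
    return success_rate * wp_success + (1 - success_rate) * wp_fail


def best_call(s_k, s_2, w_pre, w_pre1, w_pre2) -> str:
    """Return the call with the higher expected win probability."""
    wp_kick = expected_wp(s_k, w_pre1, w_pre)
    wp_two = expected_wp(s_2, w_pre2, w_pre)
    return "two-point" if wp_two > wp_kick else "kick"


def break_even_s2(s_k, w_pre, w_pre1, w_pre2) -> float:
    """Two-point success rate at which both calls give the same expected
    win probability. Assumes w_pre2 > w_pre."""
    return s_k * (w_pre1 - w_pre) / (w_pre2 - w_pre)


# Placeholder win probabilities (NOT from a fitted model).
w_pre, w_pre1, w_pre2 = 0.10, 0.12, 0.15

print(break_even_s2(0.95, w_pre, w_pre1, w_pre2))
print(best_call(0.95, 0.51, w_pre, w_pre1, w_pre2))
```

Note that only the differences between the three win probabilities matter: the break-even rate depends on how much the second extra point is worth relative to the first, which is exactly the kind of non-linearity a win probability model has to capture.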

So the bottom line is that the decision on whether to go for 1 or 2 has to account for the game situation. From an expected points point of view the two options are statistically the same, but what a coach should really be interested in is the impact on his team's win probability!

Counterfactual question: Because I always like thinking about sci-fi scenarios, what would the play-calling be if the two-point conversion rule was changed to a three-point conversion? To complicate things even more, what if you could choose between (a) an extra point kick from the 15-yard line, (b) a two-point conversion from the 2-yard line and (c) a three-point conversion from the 5-yard line?
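As a starting point for this thought experiment, here is a sketch that looks only at expected points (not win probability). The function names are hypothetical, and so are the rates: nobody knows the success rate of a three-point try from the 5-yard line, so s_three is a free parameter.

```python
def best_option(s_kick: float, s_two: float, s_three: float) -> str:
    """Option with the highest expected points among the three tries."""
    options = {
        "kick": 1 * s_kick,
        "two-point": 2 * s_two,
        "three-point": 3 * s_three,
    }
    return max(options, key=options.get)


def break_even_three(s_kick: float, s_two: float) -> float:
    """Three-point success rate needed to match the better existing option."""
    return max(1 * s_kick, 2 * s_two) / 3


# Hypothetical rates for illustration only.
print(best_option(s_kick=0.94, s_two=0.51, s_three=0.30))
print(break_even_three(s_kick=0.94, s_two=0.51))
```

Under these assumed rates, a three-point try would need to succeed roughly one time in three to be competitive on expected points.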
