The Inconsistent Quarterback Story Told Again in Less than 3,000 Words

Posted on December 6, 2009 by


The Wages of Wins discusses how performance can be measured in both the NBA and NFL.  The Wages of Wins Journal, though, almost exclusively focuses on the NBA.   Why isn’t performance in the NFL discussed more frequently?  The answer to this question can be illustrated by comparing the play of Jay Cutler and Kyle Orton. 

Cutler and Orton Defy the Pundits

The Chicago Bears finished the 2008 season with a 9-7 record, a mark that fell just short of qualifying for the playoffs.  In discussing Chicago’s problems, people tended to focus on the team’s quarterback.  As Table One reports, Kyle Orton – the Bears’ starting quarterback in 2008 – was ranked 25th (out of 32 quarterbacks) in both the NFL’s QB Rating system and the Wages of Wins metrics (i.e. QB Score, Net Points, Wins Produced).

Table One: Final Quarterback Rankings for 2008

In the offseason it became clear that Jay Cutler – a player who ranked 7th in Net Points per Play (and Wins Produced per 100 plays or WP100) – was available.  So the Bears sent Kyle Orton – plus two first round draft picks and a third round pick – to the Broncos for Cutler.  Fans of the Bears rejoiced at this move.  And fans of the Denver Broncos became very, very angry.  In the pre-season the views of both groups of fans were confirmed.  The Bears finished the exhibition season with a 3-1 mark, while the Broncos – led by a less than impressive Orton – finished 1-3.  Many NFL pundits were heard expressing the conventional wisdom:  You simply don’t trade away a “franchise” quarterback. 

And then the real games were played.  As December begins, the Broncos are 7-4 while the Bears are 4-7.  When we look at each quarterback’s stats – reported in Table Two – we see that the 2008 result has been essentially reversed.  Orton now ranks 9th in the NFL in Wins Produced per 100 plays (WP100) while Cutler is ranked 25th.

Table Two: Week Twelve Quarterback Rankings in 2009

The reversal in the ranking of these two quarterbacks is hardly unique.  Nine of the quarterbacks ranked in the top 10 this year qualified for the rankings last year.  Of these nine, only four – Drew Brees, Peyton Manning, Philip Rivers, and Matt Schaub – were ranked in the top ten at the end of last year.  And we see the same story at the bottom of the rankings.  Seven of the players ranked in the bottom ten qualified for the rankings last year.  Of these, only two – JaMarcus Russell and Derek Anderson – ranked in the bottom ten in 2008.

Despite such inconsistency, fans of the NFL – and apparently at least some decision-makers – can be impressed by a quarterback’s past numbers.  Consequently, the Bears can be tempted to give up three draft picks and a starting quarterback for an apparent “franchise” signal caller.  And the Chiefs can give up a second round pick and significant dollars for Matt Cassel (currently ranked 26th). 

The problem facing decision-makers in the NFL is that the numbers – which are often cited – don’t tell us very much about the future performance of a quarterback.  A quarterback’s statistics depend on his teammates and the quality of his coaching.  Change the teammates and coaches and you often see the numbers change as well.  Unlike basketball – where player statistics are remarkably consistent from season to season – football numbers suffer from very significant interaction effects.  This means those numbers – which told us that Cassel and Cutler are “great” quarterbacks – may not tell us much about what these quarterbacks will do when they change teams.

And it’s important to note that this isn’t just some numbers or some quarterbacks.  Less than 25% of a quarterback’s completion percentage and passing yards per attempt are explained by what the quarterback did with respect to these statistics last season.  Less than 10% of touchdowns per pass attempt this season are explained by last year; and when we turn to interceptions per attempt, explanatory power falls to less than 2% (these results come from an examination of 399 quarterbacks who played consecutive seasons from 1994 to 2007).  When we turn to measures such as QB Score, the NFL’s quarterback rating, or the numbers at, again we see inconsistency (explanatory power is less than 20%). 
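To make “explained by last year” concrete, here is a minimal sketch.  It assumes the percentages above are R-squared values from a simple regression of this season’s number on last season’s (the post doesn’t spell out the method); in that case R-squared is just the squared correlation between the two seasons.  The function names and the six data points are invented for illustration and are not drawn from the 399-quarterback sample.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def explained_share(last_season, this_season):
    """Share of this season's variation 'explained' by last season.

    For a one-variable linear regression, R-squared equals the squared
    Pearson correlation, so no regression library is needed.
    """
    return pearson_r(last_season, this_season) ** 2


# Invented completion percentages for six hypothetical quarterbacks
# in back-to-back seasons (NOT real data).
last_season = [58.0, 61.5, 63.0, 59.5, 65.0, 60.0]
this_season = [61.0, 59.0, 62.5, 63.0, 60.5, 58.5]

print(f"explained share: {explained_share(last_season, this_season):.2f}")
```

A value near 1 would mean last season tells you almost everything about this season; a value below 0.25, as reported above for completion percentage, means most of the variation comes from somewhere else.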

Such results tell us that what we see from Cutler and Orton in 2008 and 2009 should not be surprising.  Predicting performance of quarterbacks in the NFL is simply very difficult (and this is not just the story I tell, but also the story told by Brian Burke at Advanced NFL Stats).

This is really a fascinating story.  But the story was essentially told in The Wages of Wins.  And I told it again during the 2006, 2007, and 2008 NFL seasons.  Consequently, this is what I said towards the end of my discussion of the final quarterback rankings in 2008: “…the measurement of performance in football really only tells one story.  The interaction effects in football cause the performance statistics to be inconsistent.  So the players we see perform well today are not necessarily going to perform well tomorrow.  Although I like telling that story, it’s really about all I ever say about the NFL. Consequently, this very long post … might be my last post on football.”

Looking at the NFL Draft Again

But now another aspect of this story has sparked some interest.  Rob Simmons and I recently wrote an academic article examining the relationship between where a quarterback is selected in the draft and how he performs in the NFL.  For many the results were surprising.  As Rob and I report, where a quarterback is taken in the draft is not related to how that quarterback performs in the NFL. 

Once again… it’s difficult to predict the future performance of NFL quarterbacks.  On draft day NFL decision-makers have an even more difficult challenge.  People in the NFL must project how well a quarterback will play in the NFL before he ever plays with — and against — NFL talent.  Now if predicting performance of actual NFL quarterbacks is hard, what should one expect to see when it comes to projecting performance of quarterbacks that are not in the NFL?

Well, here is what Rob and I found.

1.  We did find several factors that predict where a quarterback will get drafted.  Specifically, we find that taller, faster, and smarter (i.e. better Wonderlic scores) quarterbacks get drafted first. 

2. The factors that predict draft position, though, don’t predict NFL performance.

3. Given this result, we shouldn’t be surprised that where a quarterback is drafted doesn’t predict how well a quarterback will perform in the NFL.

This is how point #3 was described a few days ago:

… here is a sample of what we found.  After a quarterback has played five seasons in the NFL (minimum 500 career plays), here are the correlation coefficients between draft position and various career statistics:

Completion Percentage: -0.01

Passing Yards per Pass Attempt: -0.02

Touchdowns per Pass Attempt: -0.12

Interceptions per Pass Attempt: 0.00

QB Score per Play: -0.01

Net Points per Play: -0.02

Wins per Play: -0.02

QB Rating: -0.06

Directly below this data — and I mean, directly below this data – I wrote the following sentences:

Our data set runs from 1970 to 2007 (adjustments were made for how performance changed over time). We also looked at career performance after 2, 3, 4, 6, 7, and 8 years.  In addition, we also looked at what a player did in each year from 1 to 10.  And with each data set our story looks essentially the same.  The above stats are not really correlated with draft position.

We should note that although draft position and performance are not related – and our story is the same regardless of when we look at the relationship — draft position and salary are clearly correlated.  To illustrate, JaMarcus Russell has collected millions of dollars to play quarterback in the NFL.  But he clearly has not performed at a level consistent with all those dollars.  And a similar story can be told about David Carr, Ryan Leaf, Tim Couch, Joey Harrington, etc…  Quarterbacks who are drafted early clearly get paid more. They just don’t seem to perform any better.

Reacting to Some Reactions

There have been a few reactions to this result that I would like to address.  Here is a sample of what I have seen.

1. A problem with reading comprehension

Let me start with a response that suggests people don’t always read what’s being said. Despite the sentences I highlighted above, I have read statements like the following (this is comment #10 on Jason Lisk’s post at Pro-Football from one of the bloggers that Steven Pinker cited):

The Berri choice to exclude QBs who didn’t play five years in the league is a pretty fundamental error to make.

Hmmm… pretty fundamental error?  Perhaps a more fundamental error is not reading a single paragraph that, once again, appeared directly beneath the results I posted. 

2. Per-Play vs. Aggregate Measures, Part One

Beyond the issue of reading comprehension skills is the objection some people have voiced to how we examined the correlation between draft position and NFL performance.  Rob and I focused on per play measures — such as completion percentage, yards per pass attempt, interceptions per pass attempt, touchdowns per pass attempt, NFL’s quarterback rating, QB Score per play, Wins Produced per play, and Net Points per play – in examining the link between draft position and NFL performance (again, at a host of different points in a quarterback’s career). 

People have argued, though, that it’s better to look at aggregate measures such as total touchdown passes or total yards.  Such examinations show a stronger correlation between draft position and performance (although not that strong).  And these examinations show that “better” quarterbacks – where “better” is defined in terms of total touchdowns or total yards – tend to be picked first (again, this is not a strong tendency).  Of course, one could define quarterbacks in terms of total interceptions thrown and show the opposite.  Quarterbacks chosen first in the draft throw more interceptions, and since interceptions are not good, this means quarterbacks taken first tend to be “worse”.

The results with respect to interceptions – and passing yards and touchdowns – are driven by the fact that quarterbacks taken first tend to play more.  So by focusing on the aggregate measures one is really looking at the link between one decision (a team liked the quarterback on draft day) and another (the team decided it will play the quarterback it liked on draft day).
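The playing-time effect is easy to see in a toy example.  The sketch below uses six hypothetical quarterbacks whose per-attempt touchdown and interception rates are, by construction, unrelated to where they were drafted, but whose pass attempts fall as the draft pick gets later.  Every number here is invented for illustration; none of it comes from our data set.

```python
from math import sqrt
from statistics import fmean


def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sxx * syy)


# Hypothetical quarterbacks (invented numbers, NOT real data).
draft_pick = [1, 33, 65, 97, 129, 161]
attempts = [3000, 2400, 1800, 1200, 600, 300]    # earlier picks play more
td_rate = [0.04, 0.05, 0.03, 0.03, 0.05, 0.04]   # unrelated to pick by construction
int_rate = [0.03, 0.02, 0.04, 0.04, 0.02, 0.03]  # unrelated to pick by construction

# Aggregate totals are just rate times opportunities.
total_tds = [a * r for a, r in zip(attempts, td_rate)]
total_ints = [a * r for a, r in zip(attempts, int_rate)]

per_play_td_corr = correlation(draft_pick, td_rate)
total_td_corr = correlation(draft_pick, total_tds)
total_int_corr = correlation(draft_pick, total_ints)

print(f"pick vs TD rate:    {per_play_td_corr:+.2f}")
print(f"pick vs total TDs:  {total_td_corr:+.2f}")
print(f"pick vs total INTs: {total_int_corr:+.2f}")
```

In this made-up sample the per-attempt rate shows essentially no correlation with draft pick, while both total touchdowns and total interceptions show a strong negative one (a lower pick number means an earlier selection).  The early picks look “better” and “worse” at the same time, which is just another way of saying the totals are measuring attempts.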

The persistence of draft day evaluations in the NFL is reminiscent of a study Colin Camerer and Roberto Weber offered in a 1999 article looking at the NBA draft.  The Camerer-Weber article looked at the factors that predicted minutes per game in the NBA.  What they found was that draft position could still predict playing time – even after performance was controlled for – years into a player’s career.  It wasn’t that performance didn’t predict playing time.  No, the important finding was that draft position – independent of NBA performance – predicted playing time.  Such results suggest that NBA teams had trouble ignoring sunk costs in making decisions.

This is essentially what Jason Lisk reported (in a less sophisticated study) with respect to quarterbacks and the NFL draft.  Even after controlling for performance, Lisk reported that draft position predicted a quarterback’s playing time.

Such a story confirms the approach Rob and I took in our examination of quarterbacks and the NFL draft.  Aggregate numbers are biased because draft position is an independent predictor of playing time.  Therefore, one should focus on per-play metrics.
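For readers wondering what “after controlling for performance” means in practice, here is a rough sketch.  It is not the regression used by Camerer and Weber or by Lisk; it swaps in a simpler cousin, the partial correlation, computed by stripping the part of playing time and draft pick that performance can account for and then correlating what is left over.  The six quarterbacks, their numbers, and the function names are all hypothetical.

```python
from math import sqrt
from statistics import fmean


def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sxx * syy)


def residuals(ys, xs):
    """What is left of ys after a one-variable least-squares fit on xs."""
    mx, my = fmean(xs), fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def partial_correlation(ys, xs, control):
    """Correlation of ys and xs once the control variable is removed from both."""
    return correlation(residuals(ys, control), residuals(xs, control))


# Hypothetical quarterbacks (invented numbers, NOT real data).
draft_pick = [1, 33, 65, 97, 129, 161]
performance = [0.04, 0.05, 0.03, 0.03, 0.05, 0.04]  # some per-play measure
plays = [2800, 3000, 1700, 1300, 2000, 1200]        # career playing time

perf_vs_plays = correlation(performance, plays)
pick_vs_plays_given_perf = partial_correlation(plays, draft_pick, performance)

print(f"performance vs plays:             {perf_vs_plays:+.2f}")
print(f"pick vs plays, given performance: {pick_vs_plays_given_perf:+.2f}")
```

In this toy sample better performers do get more plays, so performance matters.  But a strong negative link between pick number and plays remains once performance is accounted for, which is the pattern the sunk-cost story describes.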

3. Per-Play vs. Aggregate Measures, Part Two

One doesn’t need to consider the bias in playing time, though, to defend the choice of per play measures.  In evaluating players in sports we tend to focus on measures that account for how many opportunities the player was given.  For example, in baseball we tend to look at batting average, on-base percentage, slugging percentage, OPS, ERA, etc…  In basketball we tend to focus on per-minute measures.  And in football, the basic quarterback rating measure is entirely defined in terms of performance per pass attempt.

We tend to think quarterbacks are “better” when they have a higher completion percentage and throw fewer interceptions per pass attempt.  Draft position, though, doesn’t predict these measures (or any of the per play measures reported above).  But if teams were getting it “right” on draft day, shouldn’t the quarterbacks taken first have a higher completion percentage, or get more yards per pass attempt, or throw fewer interceptions per pass attempt, or produce more wins per play, etc…?

4. Draft Position and Never Playing

Steven Pinker had one more reaction to the construction of our study.  Pinker – in the New York Times – noted that lower drafted quarterbacks don’t “merit many plays”.  And this somehow establishes that teams are drafting correctly.  Again, though, this is using one evaluation to justify another.  We expect that NFL teams are going to discount players who were already discounted. 

For us to study the link between draft position and performance, we can only consider players who actually performed.  It’s possible that those quarterbacks who never performed were really bad quarterbacks.  But since they never played, we don’t know that (and Pinker also doesn’t know this).  What we do know is that for those quarterbacks who did play, draft position and performance aren’t related.   

Another way to think about this is to consider the careers of Kurt Warner and Tom Brady.  The numbers tell us that Warner and Brady are among the best quarterbacks of the past decade.  Yet both quarterbacks were passed over by teams on draft day (Warner was never selected and Brady was a 6th round draft choice). Are we to believe that Warner and Brady were the only quarterbacks passed over who could really play?  It seems likely that at least some of the quarterbacks who never played really could have contributed to an NFL team.  But once again, we will never know, since these quarterbacks never played.

And one should add once again… draft position and salary are clearly related.  Teams pay much more for a quarterback taken with one of the first ten slots in the draft.  But the evidence doesn’t indicate that these quarterbacks perform better than those taken later in the first round, second round, third round, etc…. 

5. Reacting to an Odd Interpretation of Our Results

All that being said, let me say what we are not saying.  Jason Lisk – in the blog post linked to above – notes that past NFL performance predicts future playing time.  Such a result is not surprising.  Past performance predicts future salaries in the NFL (hence Cassel gets a big payday after last season in the NFL).  How Lisk interpreted these results, though, was somewhat odd.  Here is what Lisk said towards the end of his post:

If you believe that the only reason Carson Palmer has played a lot more than Gibran Hamdan is because Palmer was drafted alot higher, then you can accept Gladwell’s position.

I certainly don’t recall Malcolm Gladwell saying that draft position was the “only” (this is Lisk’s word) predictor of future playing time.  What Gladwell argued – and what we argued – is that draft position couldn’t predict future performance.  At no point have I ever argued that NFL decision-makers don’t consider past performance in determining playing time or salaries.  In fact – as noted above – we have argued that NFL teams do consider past performance.  Unfortunately, past performance is a poor predictor of the future.  Hence, it’s not clear that the acquisitions of Cutler or Cassel will ever generate the returns envisioned when those players were acquired.

So we agree with Lisk when he argues that past performance predicts future playing time.  Where we don’t agree is with the assertion that at some point we argued something else.

Another Study Confirming Our Story

Let me close with a comment left by fellow economist Kevin Quinn at Malcolm Gladwell’s blog (you have to go through a large number of comments to get to Quinn’s thoughts):

I am a sports economist and have investigated the predictability of eventual NFL performance by QBs based on the information available just before the draft. While my approach and methods differed somewhat from those employed by… Dave Berri, my results essentially confirm his findings.

Kevin co-authored a working paper that examined the NFL draft and came to – as Kevin notes – a very similar conclusion (across a smaller sample than Rob and I considered).  Again, this result – given what we see when we look at the consistency of performance in the NFL – is not surprising.

And hopefully this extremely lengthy post answers all the reactions to the study Rob Simmons and I published (and yes, this post is less than 3,000 words – although not very far below this mark).

– DJ

For more on the Wages of Wins football metrics see

The New QB Score

Consistent Inconsistency in Football

Football Outsiders and QB Score

The Value of Player Statistics in the NFL