Before Malcolm Gladwell wrote his article on The Wages of Wins in the New Yorker he asked me a series of questions about our research. One of his questions was, “How would you say your system differs (and is better than) Hollinger’s PER? I’ve noticed that some of his conclusions vary from yours.”
In writing his article Gladwell did not reference my answer to this question, primarily (and I am speculating here) because to explain how we differ from John Hollinger’s methods he would have to first offer readers of the New Yorker a discussion of how PERs is calculated. When you are limited to less than 2,000 words, such a detour is difficult to take.
A few days ago someone asked if I would comment on PERs in this forum. Here we have no limit on how many words we post, although I will try to come in under 2,000 words.
Let me begin this rather lengthy essay by making a few observations about how Hollinger calculates his Player Efficiency Rating (PER). For this, I am employing his discussion in his Pro Basketball Prospectus 2002.
Offensive and Defensive Efficiency
As noted in The Wages of Wins, Hollinger begins in the same place where we start – specifically he notes that wins are determined by offensive and defensive efficiency. Offensive efficiency is determined by points per possession. Defensive efficiency is determined by points allowed per possession. Having each noted this relationship, though, we go in different directions. We employ regression analysis to determine the value – in terms of wins – of the various components of offensive and defensive efficiency. In other words, we go entirely where the data takes us. Hollinger does not statistically derive his values, but takes a different approach.
Why We Create Models
Having noted the importance of offensive and defensive efficiency, Hollinger proceeds to discuss a variety of measures of performance which serve as building blocks for PERs. These building blocks include Points per Shot Attempt, Pure Point Rating, Assist Ratio, Turnover Ratio, Rebound Rate, and Usage Rate. He defends these measures as “improvements” over existing metrics, often noting that the rankings that result evaluate players in a fashion consistent with what NBA observers would believe. In other words, his metrics fit what he believed about the players before he started.
Unfortunately, this is not the way science works. We do not begin with our beliefs, play with the numbers until our beliefs are confirmed, and then call it a day. Models are not evaluated in terms of whether they are consistent with what we believe, but in terms of their ability to explain what we purport to explain (and furthermore, provide predictive power).
This is a point that is often lost in discussions of how to measure player performance in sports. Let’s think about baseball for a moment. People who study baseball would argue that batting average – hits divided by at-bats – is not as good a measure of performance as OPS – on base percentage plus slugging average. The reason for this conclusion is that OPS is a better predictor of runs scored and wins than batting average. In other words, OPS is superior to batting average because it does a better job of explaining how many runs a team scores. One would not argue that OPS is better simply because it ranks players in a fashion that fits our prior beliefs.
In examining Hollinger’s metrics, though, it is not clear that his measurements are trying to explain anything more than what he originally believed about the players. He offers various weights for the statistics the NBA tabulates, and at times it appears he is constructing these weights in terms of points scored. But he never establishes that the chosen weights allow him to predict how many points a team scores or how many games the team wins. Without knowing precisely what and how well PERs explains and/or predicts, it becomes very difficult to verify Hollinger’s claim that this metric is “accurate.”
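The evaluation standard described here can be illustrated with a toy regression. To be clear, the six "teams" and their numbers below are invented purely for demonstration; only the method, an ordinary least squares fit of wins on efficiency differential, reflects the kind of validation the post argues any metric should be anchored to.

```python
# Illustrative only: regress team wins on point differential per 100
# possessions (offensive efficiency minus defensive efficiency).
# These six (differential, wins) pairs are made-up numbers, not NBA data.
teams = [
    (7.5, 60), (4.0, 52), (1.5, 45), (0.0, 41), (-3.0, 33), (-6.5, 25),
]

def ols(pairs):
    """One-variable least squares: returns (intercept, slope)."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return my - slope * mx, slope

intercept, slope = ols(teams)
print(f"wins = {intercept:.1f} + {slope:.2f} * efficiency differential")
```

The point is the procedure, not the fitted numbers: a metric's weights earn the label "accurate" by how well the resulting predictions track actual team outcomes.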
Measuring Shooting Efficiency
Looking at the specific weights Hollinger chooses we see another problem. In discussing the NBA Efficiency metric – which the NBA presents at its website – I argued that this measure fails to penalize inefficient shooting. The regression of wins on offensive and defensive efficiency reveals that shooting efficiency impacts outcomes in basketball. The ball does indeed have to go through the hoop for a team to be successful.
The same critique offered for NBA Efficiency also applies to Hollinger’s PERs, except the problem is even worse. Hollinger argues that each two point field goal made is worth about 1.65 points. A three point field goal made is worth 2.65 points. A missed field goal, though, costs a team 0.72 points.
Given these values, with a bit of math we can show that a player will break even on his two point field goal attempts if he hits on 30.4% of these shots. On three pointers the break-even point is 21.4%. If a player exceeds these thresholds, and virtually every NBA player does so with respect to two-point shots, the more he shoots the higher his value in PERs. So a player can be an inefficient scorer and simply inflate his value by taking a large number of shots.
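The "bit of math" is straightforward: a player breaks even when the expected value of an attempt is zero, i.e. when p × (value of a make) − (1 − p) × (cost of a miss) = 0. The point values 1.65, 2.65, and 0.72 below are Hollinger's, as quoted above:

```python
# Break-even shooting percentages implied by Hollinger's PER weights:
# a made two-pointer is credited about 1.65 points, a made three 2.65,
# and a missed field goal costs 0.72 points.
def break_even(made_value, miss_cost=0.72):
    """Shooting percentage p solving p*made_value - (1-p)*miss_cost = 0."""
    return miss_cost / (made_value + miss_cost)

print(f"Two-point break-even:   {break_even(1.65):.1%}")  # ~30.4%
print(f"Three-point break-even: {break_even(2.65):.1%}")  # ~21.4%
```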
But again, our model of wins suggests that inefficient shooting does not help a team win more games. Hence the conflict between PERs and Wins Produced. Hollinger has set his weights so that inefficient scorers still look pretty good. We argue that inefficient scoring reduces a team’s ability to win games, and therefore these players are not nearly as effective as people might believe.
Measuring Perceptions
Although PERs may not be the best measure of a player’s contribution to wins, it may offer a good measure of people’s perceptions of performance (after all, that appears to be the author’s intent). An earlier version of NBA Efficiency was Robert Bellotti’s Points Created model. The simplified version of Points Created is the same as NBA Efficiency, except Bellotti’s model incorporates personal fouls. In defending this model Bellotti noted in 1992 that “the NBA’s Most Valuable Player has finished either first or second that season in my Points Created rankings.” In other words, Points Created is accurate because it mimics perceptions.
Hollinger has a simplified version of PERs called Game Score, and for the 2005-06 season I found a 98% correlation between NBA Efficiency and Hollinger’s Game Score measure. In sum, it appears that Hollinger, Bellotti, and NBA Efficiency are offering very similar statements about productivity. And one can show – via an examination of voting for the MVP award and the coaches’ voting for the All-Rookie team – that metrics like NBA Efficiency are capturing people’s perceptions of performance.
Evidence Contradicting Perceptions
There is evidence, though, that perceptions of performance in basketball do not match the player’s actual impact on wins. And surprisingly, the evidence has very little to do with Wins Produced. Consider the following:
- Less than 15% of wins in the NBA are explained by payroll. Regressions are nice, but not always understood by everyone. So to further illustrate the lack of association between pay and wins I took another approach. Specifically, I ranked the teams in the NBA last year in terms of payroll and then divided this ranking into five equal segments. The results revealed that the teams in the top 20% spent an average of about $78 million on players and won – on average – 35.7 games. The next 20% spent $61 million and won 42.5 games. In the middle we see teams that spent only $54 million and won 39.7 games. When we look at the last two groupings – the teams that spent the least – we see clearly the very weak link between pay and wins in basketball. The 20% of teams ranked just below the middle in payroll won 47.7 games while spending $47 million on players. And the teams at the very bottom of the payroll rankings spent less than $38 million on their players and won 39.5 games. Yes, the teams at the bottom spent less than half what the teams at the top did and actually won more games.
- Okay, pay and wins do not have a strong link. What does this tell us about player evaluation? In football payroll explains less than 5% of wins. But in football we also see very little consistency in player performance. So decision-makers cannot easily know how to spend money to ensure success in the future. A similar problem – though to a lesser extent – exists in baseball. In basketball, though, players are much more consistent across time. The correlation between a player’s per-minute Win Score this season and last season is 0.84. As we detail in The Wages of Wins, the consistency we observe in basketball exceeds what we observe in either baseball or football. Despite this consistency, though, payroll is still not strongly linked to wins. In sum, decision-makers have a greater ability to predict the future in the NBA, yet the payroll-wins relationship still remains very weak.
- When we look at what determines salary we see the problem. The primary player characteristic that dictates wages in the NBA is scoring. Shooting efficiency, rebounds, turnovers, and steals – factors that all impact outcomes – are not strongly linked to player pay. Given this evidence, we think players are evaluated incorrectly in the NBA. Too much emphasis is placed on scoring, and not enough on all the other factors that impact outcomes.
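The quintile figures in the first point above can be put in one place. The payroll and win averages below are the post's reported numbers; the correlation is computed only across the five group averages, as a rough illustration of how weak (indeed, here negative) the pay-wins association is:

```python
# Payroll quintiles vs. average wins, as reported in the post for last
# season (figures are the post's; the correlation is illustrative only).
quintiles = [
    ("Top 20% payroll",    78, 35.7),
    ("Second 20%",         61, 42.5),
    ("Middle 20%",         54, 39.7),
    ("Fourth 20%",         47, 47.7),
    ("Bottom 20%",         38, 39.5),
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

pay = [q[1] for q in quintiles]
wins = [q[2] for q in quintiles]
r = pearson(pay, wins)
print(f"correlation(payroll, wins) across quintiles = {r:.2f}")
```

Across these five group averages the correlation actually comes out negative: the biggest spenders won the fewest games.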
This is one of the more important stories we tell in The Wages of Wins. Our examinations of payroll, salaries, and performance all suggest that players are evaluated incorrectly. Our study of metrics like NBA Efficiency – and now Hollinger’s PERs – indicates that the mistake lies in the valuation of shooting efficiency. Inefficient scorers – like Allen Iverson – are paid far more than their contribution to wins justifies. Players who do not score – but offer other significant contributions to wins – tend to be underpaid.
Is this Important?
A few days ago CNNSI.com reported that new documents have been found that shed light on the invention of basketball by James Naismith. Apparently the game that might have inspired Naismith was called “Duck on a Rock.” A bit more than a century later we are engaged in a debate about how to measure a player’s performance in Duck on a Rock, version II. When we think about it this way, perhaps this is a very trivial issue.
From the perspective of economics, though, the story we tell is important (at least, I think so). What we argue in The Wages of Wins is that decision-makers – even when they have clear objectives and an abundance of information – still can make the same error over and over (we talk about why this can happen in the book). Such a story has clear implications for how we model human behavior in economics. And that is true, even if those implications come from a discussion of how well people are playing Duck on a Rock.
– DJ
Our research on the NBA was summarized HERE.
Wins Produced and Win Score are Discussed in the Following Posts
Simple Models of Player Performance
What Wins Produced Says and What It Does Not Say
Fred Flintstone
November 17, 2006
How do you measure a player’s defensive contribution? Obviously steals and blocked shots are measured, but how about areas of defense that don’t lend themselves to statistics?
Some defenders are good at preventing entry passes to the post.
Others provide good weakside help.
Many good defenders force turnovers without getting credited in the boxscore.
Watching top defenders like Scottie Pippen, Gary Payton, or Bruce Bowen play, it seems obvious that they provide something towards WINS that your model doesn’t take into account.
kjb
November 17, 2006
This is a very interesting issue. There’s no question that shooting efficiency is extremely important. At the same time, somebody has to take the shots. If I recall correctly, your model treats all shot attempts as a negative. How do you place a value on shot creation? How do you address possession usage, and the effect that usage has on efficiency?
What kind of record does your model predict a team comprised solely of Fred Hoibergs would achieve?
Harold Almonte
November 17, 2006
I think a win rating whose data are weighted in terms of the sport’s games won is a peripheral upgrade over a performance rating whose data are a summation of accomplishments weighted in terms of points per possession (referenced or not to a league average); at least for predicting wins and a player’s economic worth on the floor, though not necessarily for the business side, which may not always be oriented toward what happens there.
The root problem of all ratings (PER and WP) is still the incomplete (whole-number box score) data and its original, arbitrary interpretation (how negative are the negative stats, really? what is the real offensive value of an offensive rebound and the defensive value of a defensive rebound? how do we add the defense the box score does not record? are scoring and defense equal halves of the game? etc.), which tends to over- or underrate some skills, like shot creation, and over- or underrate players with just one or two skills.
All this stuff is turning into a silly war of ratings and statistical methods, which I think is the wrong kind of contribution and a lack of vision. Why don’t all the analysts instead hold a meeting, go back to basics and repair the data first, and then restart the war?
Jason
November 17, 2006
The defensive rebound’s value is that it means that the opposition didn’t score and thus it’s a reasonable metric of defensive ability. A defensive rebound means that the opposition missed a shot and that’s a large part of what defense tries to do: limit the shots taken and don’t let those shots taken go in. The problem comes when only one person is credited with the rebound. The rebound may capture all of the defensive effort in terms of final outcome and thus work well in regression for assigning values to components as they relate to wins, but this isn’t the same as saying that the player who got the rebound was the only person responsible for it.
Position adjustment accounts for some of this in comparing players, but it’s still far from perfect. I think it tends to do well with big men who are going to grab more rebounds because of the position they play and are more likely to benefit from their own defense, but it has more problems with guards. There’s not much of a penalty for a point guard to allow his man to get good looks and sink shots. Sure, he might grab some rebounds, but he’s not expected to grab many and the absence of available rebounds isn’t going to change his rating as much as it would for a center.
Conversely, when Bruce Bowen hassles someone until they launch a clanker with the clock expiring, it’s more likely that Tim Duncan benefits statistically since he’s more likely to grab the rebound. Without the defensive board though, Bowen doesn’t get credit (nor does his team benefit even if he “did everything right”). It’s also possible that he forces a player into making a bad pass resulting in a turnover that is not a steal, and thus he shares credit with the other 4 guys who may or may not have helped out in this effort.
I don’t know if this is a problem with the ‘box score’ though so much as it’s a recognition that *some* of the game is a product of it being a 5-on-5 contest and not 5 1 on 1 contests. [Note the word ‘some.’ To hear the detractors it’s as if the stats are totally meaningless and don’t relate at all to what actually happens on the court and it’s just random noise. There is noise, but I doubt highly that the un-recorded synergistic interactions are sizeable enough to take away from sophisticated statistical evaluations.]
If WP or any other model accounts for the vast majority of a team’s actual record from the component stats, then it’s a sufficient model that’s accounting for the whole game. This doesn’t mean that each player on a team is properly credited and it *may* be that the box score divisions don’t allow for that. Still, the degree to which these credits are or are not properly assigned should be somewhat evident when using the stats for predictive value in future events. If stats don’t capture the bulk of what’s going on, the predictive value of improvement (or decline) when a player changes teams should be off. If most of it is captured by the stats, the new team’s benefit (or penalty) should be closer to the predicted sum of the parts. I don’t have these data, but perhaps Dave can comment.
dberri
November 17, 2006
Great comments. I will try and answer these over the weekend.
Tom
November 17, 2006
This is actually more about competitive balance in basketball. In your book, you suggested that the lack of competitive balance in basketball is because of the limited pool of freakishly tall players. I would suggest instead that the lack of competitive balance in basketball comes more from the nature of the game. In many sports, it’s possible for one or two big plays to determine the outcome of a game. In basketball, a team has to consistently make plays for the entire game in order to win. Basketball strikes me as a game where there is simply less luck. I imagine we would see similar levels of competitive imbalance in a sport like tennis or volleyball, where many points are played to determine the final outcome.
dberri
November 19, 2006
I tried to answer a few of these questions today. Let me offer a few more comments.
Fred… we can measure the ability of a team to play defense, which we incorporate in our measure of Wins Produced, but obviously the statistics do not give us a measure of how each player on a team played defense. In the future I think we can take some of the plus-minus statistics to supplement our model.
Kevin… we do not evaluate models by considering situations outside the realm of what is possible. No team would be solely constructed of shooting guards. I guess one could ask, how well would a team of Allen Iversons do? Again, that is an equally poor question. I would agree that there are players – like Hoiberg – who are productive because of the specific role they are asked to perform. Again, knowing that can help us see why that player is productive. It does not change how productive that player has been.
Harold… I disagree that the box score data is as limited as people argue. The alternative approach – plus-minus – has very significant limitations which we address in the book. I will try and write a post on plus-minus in this forum in the near future. I do think a few advocates of plus-minus, and adjusted plus-minus, have exaggerated the usefulness of this approach.
twilson
November 20, 2006
Dave, my compliments, this was the best entry I’ve read on your weblog.
Mike B.
November 21, 2006
This critique is fairly disingenuous and subtly mean-spirited in tone — for example, you state that Hollinger “defends” his system. In his 2002 book he explains his system (which he had to — the 2002 book was his first).
Then you mischaracterize PER as some attempt by Hollinger to make NBA statistics conform with pre-conceived notions about what “he believed about the players before he started.” While it’s possible that Hollinger may have made comments in the 2002 Basketball Prospectus that PER had confirmed prior observations about players, in pages 1 through 15 of the book, where he explains his methods, this sentiment is not in evidence. If anything, he goes to great pains to note that perception of the abilities of both teams and players is distorted by conventional statistics.
Thus, when you go on to attack his methodology for PER, by pulling out two of the components, you make it appear to the reader that Hollinger created these values out of thin air. However, he actually explains how he derived these values in the text of the book.
The 1.65 points per made two point field goal is derived from taking assists into account (that is, part of the 2 points should be credited to the passer for assisted baskets). The .72 value for missed field goals is derived from the fact that 71% of missed field goals are rebounded by the defense.
If you want to critique Hollinger’s methods, go right ahead — it would be very informative. But if you fail to fully and accurately portray his positions, then it is hard to invest much credibility in your critique. Once Hollinger’s position is accurately described, compare it to the Wages of Wins approach and then explain why the WoW approach is superior.
It seems to me that the primary beef that WoW would have with Hollinger (and similar methodologies) is the notion of usage rate. I have always found this concept to be credible – the comment about a team of Hoibergs is spot on. You are correct that no team would do that – in fact, there have been some Bulls teams in the post-Jordan era that seemed to have five Dickie Simpkins. The ability to create your own shot is meaningful to a point.
chew2
November 23, 2006
Perhaps you are using the wrong outcome metric. It is not number of wins, but how much revenue the team or player generates for the owner.