Arturo Galletti has written an excellent column detailing the calculations behind adjusted plus-minus (APM). This column explains how this measure is calculated, and in the process, casts further doubt on the validity of this approach. Below is a comment on Arturo’s post. Before you read this, please read what Arturo said.
Furthermore, you probably should also listen to our weekly podcast (with Andres Alvarez, Mosi Platt, Arturo, and me). This podcast was devoted almost entirely to this subject.
Okay, now that you have read Arturo’s post and listened to our podcast, here are some additional thoughts. These additional thoughts begin with a review of what I think we already knew about APM.
Problems with Box Score Models
The APM method appears to be a response to the methods used to analyze the box score. So our review begins with the models used to evaluate a player’s box score measures.
Perhaps the oldest model created to evaluate NBA players is NBA Efficiency. This simple model – which adds together a player’s positive stats and subtracts the negative (without any effort to weight these) – has its roots in the TENDEX model created by Dave Heeren. And Dave Heeren said this model goes back about 50 years.
The NBA Efficiency model – as has often been noted – is not highly correlated with team wins. And this is because it rewards inefficient shooting. If a player exceeds a minimum threshold for shooting efficiency (33% from two-point range and 25% from three-point range), the more a player shoots, the higher his NBA Efficiency score will be (a similar observation is made about John Hollinger’s Player Efficiency Rating). Since inefficient shooting doesn’t actually win games, models with this problem will have a hard time explaining outcomes.
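To see where those two thresholds come from, here is a minimal sketch. It assumes the usual unweighted form of NBA Efficiency, in which a made shot adds its point value and a missed shot subtracts one; the function name is illustrative and not taken from any published implementation.

```python
def efficiency_per_attempt(shooting_pct, points_per_make):
    """Expected change in an unweighted efficiency score from one shot attempt.

    Assumption: a make adds its point value and a miss subtracts 1.
    """
    return shooting_pct * points_per_make - (1.0 - shooting_pct)


# Break-even percentages: solve p * points - (1 - p) = 0 for p.
break_even_two = 1.0 / (2 + 1)    # one in three, about 33%
break_even_three = 1.0 / (3 + 1)  # one in four, 25%

# A 40% two-point shooter gains a little from every extra attempt,
# so taking more shots raises the score.
gain_at_40 = efficiency_per_attempt(0.40, 2)
```

Under that assumption, any shooter above the break-even line adds to his score simply by shooting more, which is the problem described above.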
The inability of the NBA Efficiency family of measurements to explain wins has not gone unnoticed by some people. People have seen players with high efficiency marks (like Allen Iverson) leave a team and the team hasn’t actually gotten worse. Or join a team and not make it much better. This has led some to question whether these statistical formulas really capture player performance.
These questions, though, didn’t just cause people to question the formulas. What people have actually questioned is the box score numbers used to calculate these metrics. Because basketball is a team sport, it is reasonable to think that a player’s numbers depend on his teammates. Furthermore, there are events that happen on the court that the numbers don’t capture (like on-the-ball defense). Consequently, those box score numbers, some argued, can’t be relied upon to measure player performance (of course, such an argument, as discussed before in this forum, ignores the consistency we see with respect to NBA box score numbers).
Moving to Plus-Minus
Such a story has led people to look past the box score numbers at a player’s plus-minus. Plus-minus captures how a team does when a player is on and off the court. The problem with plus-minus, though, is that basketball is a game of five-on-five. So a player’s plus-minus is a function of the following factors: the player’s ability, the ability of his teammates, the ability of the teammates who take the floor when the player is on the bench, and the quality of the opponents the player is facing. Of this list, we really just want to know about the player’s ability. So how do we capture this one factor?
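A small sketch of raw plus-minus may make the problem concrete. The stint layout, player labels, and numbers below are made up for illustration and are not real game data.

```python
def on_off_plus_minus(stints, player):
    """Return (on_court_margin, off_court_margin) for one player.

    Each stint is (players_on_floor, points_for, points_against) from the
    point of view of the player's own team. This layout is an assumption
    made for the example.
    """
    on_margin = 0
    off_margin = 0
    for players_on_floor, points_for, points_against in stints:
        margin = points_for - points_against
        if player in players_on_floor:
            on_margin += margin
        else:
            off_margin += margin
    return on_margin, off_margin


# Hypothetical stints for one team.
stints = [
    ({"A", "B", "C", "D", "E"}, 12, 8),
    ({"A", "B", "C", "D", "F"}, 10, 11),
    ({"B", "C", "D", "E", "F"}, 6, 9),
]

on_a, off_a = on_off_plus_minus(stints, "A")
```

Player A's on-court margin here is earned alongside B, C, and D in every stint A plays, so the raw number cannot be credited to A alone.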
The solution people have offered is adjusted plus-minus (APM). This measure – which several NBA teams have now apparently employed for a few years – is supposed to control for a player’s teammates and opponents. And therefore, it is supposed to be the “best” representation of a player’s ability. But upon further review….
Here is what we know about APM.
As detailed in a published journal article, Stumbling on Wins, a soon to be published article in an academic collection, and the FAQ page for this forum…
The APM coefficients are often insignificant.
For example, consider Corey Brewer. With the Timberwolves this year, Brewer had an APM of 0.57. So according to this number, Brewer was an above average player with the T-Wolves. When we look at Kobe Bryant with the Lakers this year, his APM is -10.87. And that means that Kobe is a below average player in 2010-11. Actually that is an understatement. A mark of -10.87 means that Kobe is just awful this year.
Or does it? For both players the standard error of the coefficient is so large that the correct interpretation of the result is that neither Brewer nor Bryant had a statistically significant impact on the outcomes observed for their team. In other words, because the standard error is relatively large (a general rule of thumb is that the absolute value of the coefficient should be at least twice the standard error) we cannot differentiate the coefficient from zero. And therefore, we cannot conclude a relationship between the player and outcomes actually exists (i.e. as far as this model can tell, neither Brewer nor Kobe matters for their respective teams).
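The rule of thumb can be written down in a few lines. This is only a sketch: the post does not report the standard errors themselves, so the example computes how small each one would have to be for the coefficient to pass, rather than guessing at the actual values. The function names are illustrative.

```python
def is_significant(coefficient, standard_error, threshold=2.0):
    """Rule-of-thumb check: |coefficient / standard_error| >= threshold."""
    return abs(coefficient / standard_error) >= threshold


def largest_se_for_significance(coefficient, threshold=2.0):
    """Largest standard error at which the coefficient still passes the rule."""
    return abs(coefficient) / threshold


# APM values quoted in the post. Their standard errors are not given,
# so we only compute the cutoff each one would need to beat.
brewer_cutoff = largest_se_for_significance(0.57)    # 0.285
bryant_cutoff = largest_se_for_significance(-10.87)  # 5.435
```

Read this way, saying that Bryant's -10.87 is insignificant amounts to saying its standard error is larger than about 5.4 points.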
People have argued that when you add more data the problems of large standard errors will be reduced. This is true, but even when we have more years it is still the case that many of the estimated coefficients appear to be statistically insignificant (Brewer and Bryant both have insignificant coefficients when we look at two years). Furthermore, one reason we see “improved” results with more years is that when you add more data to any model the standard errors will fall (because the number of observations is part of the standard error calculation). So that may not mean the model is any better.
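The mechanical effect of sample size is easiest to see in the simplest case, the standard error of a mean. The sketch below uses that textbook formula as a stand-in (it is not the APM regression itself), and the spread and sample sizes are hypothetical.

```python
import math


def standard_error_of_mean(sample_std, n_observations):
    """Textbook standard error of a sample mean: s / sqrt(n)."""
    return sample_std / math.sqrt(n_observations)


# Hypothetical spread of 12 points; only the sample size changes.
one_block = standard_error_of_mean(12.0, 100)    # 12 / 10
four_blocks = standard_error_of_mean(12.0, 400)  # 12 / 20
```

Quadrupling the observations halves the standard error even though nothing about the underlying model has changed.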
The APM coefficients are inconsistent across time
Beyond insignificance we also have a problem with inconsistent measurements across time. Decisions are made about the future. So we don’t want to know if a measure can just explain the past. We need to know whether future measures are correlated with measures taken in the past. For simple plus-minus, year-to-year correlations are quite low.
One might think this is because plus-minus doesn’t control for teammates and opponents. In other words, APM – which supposedly controls for teammates and opponents – would solve the problem observed with plus-minus. But as reported in various places before, only about 7% of a player’s APM this year is explained by the player’s APM last year. And when a player switches teams, the player’s APM this year is not statistically related to his performance the previous season. And that means APM can’t tell you anything about what a player will do when he changes teams. So if you change teammates – something APM is supposed to be controlling for – you don’t get the same APM.
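For readers who think in correlations rather than explained variance, the two are linked by a square: with one predictor, the share explained is the square of the correlation. The sketch below shows the conversion for the 7% figure, along with a plain Pearson correlation one could run on paired season values. The function name is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# 7% explained corresponds to a correlation of roughly 0.26 in magnitude.
implied_correlation = math.sqrt(0.07)
```

To apply `pearson_r`, pass last year's values as `xs` and this year's values for the same players as `ys`; squaring the result gives the share explained.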
Arturo Deconstructs the Model
The issues of insignificance and inconsistency suggest the APM model can’t be used by decision-makers. But there is yet another issue. Arturo Galletti has offered an extensive discussion of this model that details how it is calculated. And this discussion reveals a few items of interest.
Quoting from Arturo’s article:
….two things jumped out (when Arturo looked at the APM model). One the correlation to wins was very low (~10% R^2) and the +/- numbers don’t quite add at the team level. Somehow they do add up in the final +/- APM numbers.
Let’s talk about the lack of correlation. Arturo notes he looked at this model from a variety of different angles. And as Arturo notes…
Every single regression gave me less that 5% R-Sq. So I feel confident in the statement that the correlation of the model in step 1 (as described) is <5%.
So the model designed to control for the quality of a player’s teammates and the opposition the player faces explains less than 5% of outcomes. The lack of explanatory power, though, is not something proponents of APM have gone out of their way to highlight.
So how does one take a model that can’t explain outcomes and transform it into something that can? Well, there are two more steps. Again we turn to Arturo’s post:
The model now takes the True +/- values (outcome from step one) for each player from the first equation and regresses those against those player’s stats to determine weights for each stat.
This second step has a reported r-squared of 44%. Again, that isn’t explaining outcomes very well either.
To get to a model that explains outcomes, we have a final step. Again from Arturo…
The final step is to take the Pure regression (step one) and the Stats model (step two) and adds them up by player like so:
APM = x* Pure +/- + (1-x)*Statistical +/-
And proceed to adjust x between 10% and 90% for each player to minimize the error.
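The blend described in the quoted passage can be sketched as follows. This is only an illustration of a per-player weight search: the target each player is fitted to, the grid of candidate weights, and the function names are all assumptions made for the example, and Arturo's post remains the reference for what the actual model does.

```python
def blend(pure, statistical, x):
    """Weighted mix from the quoted formula: x * Pure + (1 - x) * Statistical."""
    return x * pure + (1.0 - x) * statistical


def best_weight(pure, statistical, target, steps=81):
    """Grid-search x in [0.10, 0.90] to minimize |blend - target|.

    `target` stands in for whatever value the error is measured against;
    that choice is an assumption of this sketch.
    """
    best_x, best_error = None, float("inf")
    for i in range(steps):
        x = 0.10 + 0.80 * i / (steps - 1)
        error = abs(blend(pure, statistical, x) - target)
        if error < best_error:
            best_x, best_error = x, error
    return best_x, best_error


# Hypothetical player: the two estimates disagree, and x is tuned to the target.
x_fit, error_fit = best_weight(pure=4.0, statistical=0.0, target=2.0)
```

Because the weight is chosen separately for each player against the quantity being explained, the blended number will tend to fit better than either ingredient does on its own.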
So what does that mean?
Here is how Arturo summarizes the explanatory power of the model:
…the r-squared for the APM model is very much a fabrication. The correlation to point margin & wins of the model shown in BasketballValue is artificially inflated by adding the error back in.
Summarizing the Story
So the APM model has the following three characteristics:
1. The coefficients are often not statistically significant. So for most players, the correct interpretation of the results is that the player in question does not have a statistically significant impact on outcomes.
2. The results are very inconsistent over time. So a decision-maker cannot look at past values and use these for decisions about the future (of course, all decisions are about the future).
3. And the model itself doesn’t really explain outcomes. At least, it doesn’t appear to explain outcomes without that very interesting third step.
As Arturo summarizes…the APM model examined does not hold up under scrutiny. It is built to account for all the variability in the process but hold very little actual correlation to the actual process.
One should remember – as Arturo notes – that there is more than one version of the APM model. So it is possible that other versions address these issues. But at this point, we can’t be sure about these other approaches. Or as Arturo put it in the comments section on his post…
The APM model as currently constructed on BasketballValue is not something I can put any credence in at this point, given what I now know about it’s construction. However, models like Wayne Winston’s are interesting as points of references. I do tend to take closed models with a huge grain of salt now. Call me Doubting Thomas.
– DJ
Kent
March 5, 2011
“The inability of the NBA Efficiency family of measurements to explain wins has not gone unnoticed by some people. People have seen players with high efficiency marks (like Allen Iverson) leave a team and the team hasn’t actually gotten worse. Or join a team and not make it much better. ”
Dr. Berri, have you tested your model similarly and done out of sample predictions of how teams will improve/worsen when players join/leave?
dberri
March 5, 2011
The correlation in ADJP48 for all players is about 0.85. If they switch teams, it is 0.76. So players are not as consistent when they switch teams, but still fairly consistent.
Kent
March 6, 2011
Yeah, but I meant the impact on overall team wins. That correlation is compellingly high, but the other calculation I’d do is project team performance after rosters are reshuffled. (I realize minutes allocations, etc. would defray the correlation a bit, but that shouldn’t bias the results systematically.) I guess you address this indirectly w/ your analyses showing how there isn’t much diminishing marginal return to rebounding, but if you’re going to cite players switching team and impact on team performance as support for your model and criticism of others why not do a formal test of it? Thanks
dberri
March 6, 2011
Kent,
Both tests are identical. I looked at the link between a player’s APM this year and his APM last year with a different team. There is no statistical relationship. I did the same test for ADJP48. There is a relationship, and a fairly high correlation.
Of course, that isn’t the big story here. The big story here is that APM can’t explain current wins without a very odd third step.
Nima Ghamsari
March 6, 2011
Why would you expect something with a high standard error in a single year to be consistent from year to year? Doesn’t make sense.
And I don’t think saying that decision makers can’t use is right. Why wouldn’t they use a rolling, larger window of APM for the player to make decisions?
dberri
March 6, 2011
Nima,
Given how APM is calculated (look at that third step again), why would it make a difference if you looked at this over one year, two years, or 100 years? The model itself doesn’t explain outcomes without that third step.
marparker
March 6, 2011
This post is gonna end up with about a million comments.
Nima Ghamsari
March 6, 2011
I wasn’t making a comment about the methodology they used. It does seem like a huge hack to arbitrarily choose the best “x” to minimize the error for each player.
I was merely commenting on some of the other assertions in your post. Let’s say there IS some “correct” measure with high SE in a single year. That measure will definitely vary from year to year (ie won’t be consistent) but I don’t think that makes it useless for a decision maker.
I’m not saying it’s a good thing. I’m just saying that it should still be considered.
Kent
March 6, 2011
“I looked at the link between a player’s APM this year and his APM last year with a different team. There is no statistical relationship. I did the same test for ADJP48. There is a relationship, and a fairly high correlation.”
If a player’s ADJP48 is the same even after switching teams it could mean his role is the same on 2 different teams, not that he is contributing the same # of wins to each team. Seeing the win total of the reshuffled roster would implicitly control for interactions, etc.
Mike
March 6, 2011
“..it could mean…”
And if it does, it means that the results would be the same in APM, no? They aren’t so one or tother is wrong…
Carlos_XL
March 6, 2011
@Kent
Of course terribly high se would make a model useless for decision-makers. As Dr. Berri said in the podcast, APM’s most frequent assessment of players is “I don’t know.” That sounds pretty useless, without even getting into the year-to-year variation.
EntityAbyss
March 6, 2011
dberri, WP seems to outshine all these other metrics, but have u looked into the diminishing returns effects on all statistics? So that you could better predict future season performances.
Peter
March 6, 2011
Mike, player minutes would be highly autocorrelated even if a player switched teams. I don’t think it’s enough to just say that a player’s stat being autocorrelated means the value of the player is being appropriately measured. For this win score system to be proven I really think it needs to be tested as predictions on team wins after rosters are reshuffled.
Mike
March 7, 2011
@peter
Chauncey billups traded for Iverson: https://dberri.wordpress.com/2008/11/04/did-i-mention-i-was-an-allen-iverson-fan/
ilikeflowers
March 7, 2011
@Peter
For this win score system to be proven I really think it needs to be tested as predictions on team wins after rosters are reshuffled.
Here you go, courtesy of The Sport Skeptic.
Italian Stallion
March 7, 2011
“Here you go, courtesy of The Sport Skeptic.”
One of the things that may have impacted the Cavs negatively this year is that most models give credit to a player for an assist, but don’t deduct partial credit from the player that scored an assisted basket.
If assists are actually adding value by increasing the efficiency of scorers (which the evidence seems to indicate they do), then the absence of assists will reverse the benefit. Hence the loss of James probably had a bigger negative impact than expected.
ilikeflowers
March 7, 2011
@IS
What sticks in my mind also is regression to the mean both for the individual and the team (and the teammate regressions to the mean back to each individual). I’m wondering if it would be worthwhile (improve predictability enough to warrant the effort) to incorporate this as well as age effects into predictions of future performance for individuals as well as teams.
Kent
March 7, 2011
“@peter
Chauncey billups traded for Iverson: https://dberri.wordpress.com/2008/11/04/did-i-mention-i-was-an-allen-iverson-fan/”
Mike, that is a single data point. Dave Berri dwells on it all the time but if he is going to use one data point to justify his theory then he should do a full scale test of how well his model predicts *future* wins after players switch teams.
Italian Stallion
March 7, 2011
ilikeflowers,
I think a “case by case” analysis in the hands of an informed analyst will probably outperform some kind of automated projection of improvement or mean reversion, but I think even an automated model should be able to outperform no projection at all.
Italian Stallion
March 7, 2011
I think adjusted +/- has some uses, but it’s not something I’d want to make a major decision or big bet based on.
Once you examine how volatile the results are and see more than a handful of totally preposterous single season results, it’s hard to have a lot of faith in any of the numbers (many) that seem to make a lot of sense based on box score data and observation.
I’ve also seen examples of adjusted +/- ratings for the same player from different sources that were totally different because of minor methodology differences.
There’s clearly a lot of room for improvement.
dberri
March 7, 2011
IS,
The basic APM model appears to explain less than 5% of outcomes. So what would be the potential uses?
Here is another question to ask: Did the teams that buy this know that the basic model explains less than 5% of outcomes? Would they have understood if that was explained to them?
Okay, that was two questions to ask.
Aldo
March 8, 2011
Just pondering…
When we ask the question “how good is player X?”, it makes much sense to think of it in terms of how that player contributes to make the team win.
But is that the only possible interpretation? Could a player be better than another and still not produce as many wins? Could there be another premise for a performance model other than wins produced?
Mike
March 8, 2011
@Kent http://sportskeptic.wordpress.com/2011/02/10/predicting-the-past-wins-produced-edition/ not cut it for you?
sophia qu
March 8, 2011
arturo’s looked at a six year old rosenbaum formulation of apm, not the sort that’s up on basketballvalue or other stat resource websites
also how come you value rebounding so goddang much, dave berri? reggie evans looked like an mvp by your metric before he hurt his foot
Italian Stallion
March 8, 2011
dberri
I look at +/- stats when a player’s individual defense is generally believed to be either excellent or terrible relative to the rest of his game.
If the overall adj +/- and on/off defensive stats suggest the player is way better/worse than the box score stats and they are in line with generally held beliefs about the player’s defense, I consider that evidence that the player may be more or less valuable than the box score numbers.
If people were forming their opinions about individual defense based on the adj +/- and on/off stats that wouldn’t work (lol), but I think most of these opinions are based on players’ and coaches’ perceptions of the live action. I’m sure Lebron James knows which players do the best job on him and vice versa.
I wouldn’t consider what I am doing to be highly accurate, but I’m comfortable thinking in terms of something having either positive or negative value without knowing exactly how large the value is.
Carlos_XL
March 8, 2011
Since we have some time between posts, I was hoping someone could answer a question for me about wins produced. I’m a relative newbie.
The wins produced model values rebounds highly. What exactly are we regressing on when we make that statement? RPG? Rebound rate? And as a result, what exactly does that number capture? A guy could grab more of the available rebounds (ie keeping the opponent from grabbing offensive rebounds), or his team could be causing the opponent to miss frequently, which would mean the model values rebounds highly in part because the player is part of a solid defensive unit.
The latter would make total sense to me.
Kent
March 8, 2011
Dr. Berri, any thoughts on this– http://sportskeptic.wordpress.com/2011/02/10/predicting-the-past-wins-produced-edition/ ?
Alien Human Hybrid
March 9, 2011
@ Carlos-
Check this out- it may be helpful: https://dberri.wordpress.com/frequently-asked-questions-and-comments/