People who study statistics in basketball often look with envy at the numbers in baseball. A hitter in baseball stands at the plate by himself. Consequently, the numbers he generates are primarily about the actions he takes. In contrast, basketball players run with four other players. The interaction between those players appears to diminish the meaning of the numbers generated.
The sport of football highlights the importance of interaction effects. The numbers we see for quarterbacks and running backs are very inconsistent across time. This tells us that numbers in football are not just about the player, but also about the player’s teammates. Consequently, forecasting the future in football – as Brian Burke recently noted – is very difficult.
People often argue that basketball is very much like football. To illustrate this argument, just recently we discussed the impact Trevor Ariza will have on the Houston Rockets. Ariza posted a 0.192 WP48 [Wins Produced per 48 minutes] for the Lakers in 2008-09. Such a mark is nearly double what we see from an average player, indicating that Ariza will help the Rockets in 2009-10. People argued, though, that Ariza had Kobe Bryant as a teammate last year; and without Kobe in Houston, Ariza won’t be nearly as productive.
Although this story was told, the evidence suggests otherwise. Here is what Ariza did in 2007-08 and 2006-07 (much of which was spent in Orlando, without Kobe).
2007-08: 0.225 WP48
2006-07: 0.217 WP48
These numbers suggest that what Ariza does on the court is really about Ariza.
Now let’s think about baseball. The Detroit Tigers just signed Aubrey Huff. Last season Huff posted a 0.912 OPS, a mark that ranked 15th among 147 qualified Major League Baseball hitters. In other words, Huff ranked in the top 10% in baseball. This suggests that Huff is one of the most productive hitters in all of baseball, and therefore, fans of the Tigers – like me – should be thrilled.
When we look at the numbers from this year, though, it’s a very different story. Huff’s OPS in 2009 (prior to coming to Detroit) is 0.726. This mark ranks 136th out of 147 qualified hitters, placing Huff in the bottom 10% in the league.
If we look over Huff’s career we see a similar pattern. Here are Huff’s OPS and ranking from 2003 to 2007:
2003: 0.922 OPS, 19th (top 17%)
2005: 0.853 OPS, 45th (top 41%)
2006: 0.813 OPS, 77th (top 60%)
2007: 0.779 OPS, 103rd (top 65%)
The past six years of Huff’s career demonstrate a great deal of inconsistency. So which Huff did the Tigers add? Are they getting the player ranked in the top 10% in 2008? Or is it the player ranked in the bottom 10% in 2009? It seems likely that even Huff isn’t sure. Huff’s job is to hit a round ball with a round stick, and that’s simply not an activity that can be predicted easily.
In the Wages of Wins we noted that the stories of Ariza and Huff are not unique. The numbers attached to players in basketball are simply more consistent than the numbers we see in baseball. And this means that decision-making should be easier in basketball. For example, the Portland Trail Blazers just signed Brandon Roy – a player ranked in the top 10% in the NBA [in WP48]– to a long-term contract. If Roy stays healthy, the Blazers can count on him remaining a top player in the game. And that will probably be true, regardless of his teammates. A similar story can be told about Chris Paul, LeBron James, and Dwight Howard. If these players stay healthy, teammates can come and go and these players will still rank towards the top of the league. Likewise, a player like Jamal Crawford – who has consistently placed in the bottom half of the league rankings – is not going to transform into one of the top players in the game now that he is with the Hawks.
If only people in baseball had the data we see in basketball. Maybe the recent history of the Tigers would then be quite different. The Tigers finished in second place with 88 wins in 2007. They then added $40 million in payroll, and proceeded to lose 88 games and finish in last place. Given what we know about inconsistency in baseball, this result doesn’t necessarily mean that the Tigers didn’t know what they were doing. The Tigers might have simply suffered the wrath of baseball’s inconsistency.
In contrast, the Pistons added Allen Iverson in 2008-09 and announced that such a move would help the team contend. The data, though, suggested fans of the Pistons were about to be disappointed (and eventually they were).
Hmmm…let’s think about this again. The Tigers have a built-in excuse when an entire season goes to hell. The Pistons – because their data is better – don’t get to use this same excuse (although they try). So maybe people in basketball should hope for the data we see in baseball.
-DJ
Our research on the NBA was summarized HERE.
The Technical Notes at wagesofwins.com provide substantially more information on the published research behind Wins Produced and Win Score.
Wins Produced, Win Score, and PAWSmin are also discussed in the following posts:
Simple Models of Player Performance
What Wins Produced Says and What It Does Not Say
Introducing PAWSmin — and a Defense of Box Score Statistics
Finally, A Guide to Evaluating Models contains useful hints on how to interpret and evaluate statistical models.
Ryan J. Parker
August 19, 2009
Great post. I enjoyed it a lot.
simon
August 19, 2009
Thank you for the post. Speaking of record-keeping, what do you think could be added to the current box score system to improve the quality of available stats?
dberri
August 19, 2009
Simon,
Are you asking about basketball? One message I am trying to convey to people is that basketball stats are really very good. I think they are better than what we see in baseball.
People have talked about doing more to track defense (which is not tracked as well as people would like in baseball either). But teams play defense as a team. Even when a team isn’t playing zone, there is still a team defensive scheme. So I am not sure how you would assign defensive outcomes to individual players. I have read that the defensive stats at 82games.com are not very consistent over time (suggesting these stats are not really capturing the player).
T-Bone
August 19, 2009
Did I miss something here or does that last sentence contradict the entire conclusion you were trying to draw?
Lior
August 19, 2009
The post proposes three (well, four) possible reasons why baseball stats are so inconsistent. Assume that before any at-bat the player’s level of play is chosen from a distribution around his “mean level of play”, similar to the Elo model for chess players. Why can’t we measure this “mean”?
1. Sampling error: baseball players don’t have enough at-bats in a season for external effects (opposing pitchers/ballparks/weather/strategy/injury) to average out.
2. Sampling error: hitting baseballs is hard, so that baseball players have large intrinsic variability in the level of play they present at any particular at-bat, independently of external factors. (Again, enough at-bats should make it possible to measure the mean anyway).
3. Inconsistency of the mean: “mean” player performance varies greatly over time in unpredictable ways (unlike the NBA age-dependent curve).
4. Team effects: I don’t know enough about baseball to know what these might be.
What does research say about these possible explanations?
Basketball research has not yet addressed this question either (and that is my main wish for future research on WP48). Specifically for basketball, we would like to use WP48 as a measure of intrinsic player ability (for example, to predict future performance). For this we need to know the confidence interval: say player A has WP48 of 0.210 and player B has WP48 of 0.180. How sure are we that player A is better than player B?
Tball
August 19, 2009
T-Bone,
You missed it. It was subtle. Baseball (GMs) has an excuse that the stats are inconsistent. Basketball management should wish they had similarly inconsistent stats so that they would have a built-in excuse.
simon
August 19, 2009
T-Bone// that was a joke about having a pre-built excuse for failing
dberri// indeed, the current boxscore seems pretty good, but I was just wondering if something simple can be added to boost the predictive power of the model. I also remember your comment about Dean Oliver’s notation system in the WoW’s footnote.
Ryan J. Parker
August 19, 2009
Lior, why would there ever be uncertainty in WP48? Clearly if one player’s WP48 is greater than another, then that player is certainly better!
Lior
August 19, 2009
Ryan: if you’d like to do the experiment, I have a simple suggestion. For each player-season, divide all the games played into two halves uniformly at random [this is to avoid biases due to player performance depending on time]. Almost surely (except for low-minutes players) the two WP48 values for the player will be very close. The average square deviation will give us an idea of the accuracy of the number. Weighting by minutes will reduce the noise but bias the measure toward better players?
PS: This seems like an obvious calculation to do, but does NBA playing time correlate better with PER or WP48? [I’d guess PER]
PPS: it would also be nice (but harder) to have an idea of the variability in performance of individual players. Would you rather have a player who gives 0.150 every game or 0.200 for half and 0.100 for half? Did the Mailman really deliver more consistently than the average player?
Lior
August 19, 2009
Oops: I didn’t mean to suggest separating the two halves of each game, but, for each player, to divide the set of games played into two subsets, each of half the size.
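For anyone who wants to try the split-half experiment described above, here is a minimal sketch in Python. It assumes you already have per-game minutes and Wins Produced for each player; the function names and the game logs below are made up for illustration, and the gap is left unweighted (weighting by minutes is the open question raised above).

```python
import random
from statistics import mean


def split_half_wp48(games, rng):
    """Randomly split one player's games into two halves; return WP48 of each.

    `games` is a list of (minutes, wins_produced) tuples, one per game.
    """
    shuffled = games[:]
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2

    def wp48(sample):
        minutes = sum(m for m, _ in sample)
        wins = sum(w for _, w in sample)
        return 48.0 * wins / minutes

    return wp48(shuffled[:mid]), wp48(shuffled[mid:])


def mean_squared_gap(player_games, seed=0):
    """Average squared difference between the two half-sample WP48 values."""
    rng = random.Random(seed)
    gaps = []
    for games in player_games.values():
        a, b = split_half_wp48(games, rng)
        gaps.append((a - b) ** 2)
    return mean(gaps)


# Hypothetical game logs: (minutes, wins produced) per game.
example = {
    "steady": [(36, 0.15)] * 10,
    "streaky": [(36, 0.05), (36, 0.25)] * 5,
}
print(mean_squared_gap(example))
```

A small average gap would suggest a season's WP48 is measured fairly precisely; a large one would suggest a lot of game-to-game noise.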
Ryan J. Parker
August 19, 2009
Lior, the simplest case would be to see how well WP48 predicts future wins. Prior to last year, how well could we have predicted the future wins *knowing* how many minutes the players played? I’ve never seen Berri do this analysis, and it would help bring insight into how the metric works for predicting the future.
Christopher
August 19, 2009
“2007-08: 0.225 WP48
2006-07: 0.217 WP48
These numbers suggest that what Ariza does on the court is really about Ariza.”
This really is the crux. I’d just like to see more evidence than n = 2 seasons and n = 1 player.
I recall a 0.7 correlation across time. But this would imply that last season’s performance accounts for about 50% of next year’s performance (just moving from r to r² in an SLR).
In any event, if I can convince myself to believe this (and I want to) then this adds a whole different dimension to WOW, the predictive one that I have not really seen. That is, I want to see a system whereby WP48 is used before a season to predict wins. I realize this opens the minutes played issue but this would be nothing short of amazing (if it worked). For the record, Erich Doerr posted predicted WL records for this last season using many metrics (including WOW with various minutes played assumptions). I generated a graph (a Taylor diagram) that “shows” who/what did best. I’d like to post that here, any objections? On that note, I cannot figure out how to post an image in blog comments.
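The step from r to r² mentioned above can be checked with a few lines of Python. This is only a sketch: the six pairs of WP48 values are invented, and the helper names are mine. It shows that in a simple linear regression the share of variance explained equals the squared correlation, and that a correlation of 0.7 (if that recollection is right) works out to 0.49.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def r_squared_of_fit(xs, ys):
    """Share of variance in ys explained by a least-squares line on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical WP48 values for six players in consecutive seasons.
last_season = [0.05, 0.08, 0.10, 0.12, 0.18, 0.22]
this_season = [0.09, 0.04, 0.13, 0.10, 0.14, 0.20]

r = pearson_r(last_season, this_season)
# The two printed values agree: explained variance equals r squared.
print(round(r ** 2, 4), round(r_squared_of_fit(last_season, this_season), 4))
print(round(0.7 ** 2, 2))  # the recalled 0.7 correlation gives 0.49
```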
Nick
August 19, 2009
Just a question:
At what point does a player’s performance change from consistency to inconsistency? The Ariza numbers you have are not the same, so clearly we’re not talking about precise consistency, which makes sense. But how much variation is sufficient to make a player’s performance inconsistent?
Lior
August 19, 2009
Ryan: Knowing how well WP48 predicts future wins is something we’d really like to know. In some sense it is the goal of the project. However, I think we should also know how well WP48 represents the past. Let’s say current season WP48 explains 80% of the variance in next season wins. Is that because WP48 isn’t exactly the same as playing skill, or because playing skill varies between seasons? [of course I’m implicitly assuming that playing skill is constant within the season, hence my PPS about game-to-game variability].
Also, knowing the number of minutes the players will play next season would by itself tell us something about how well they are going to play (assuming coaches use ability when assigning playing time; see my postscript above).
Lior
August 19, 2009
It should be noted that the correlation between current-season WP48 and current-season wins has been studied in detail (that’s exactly the correlation between total team WP and total team wins). Correlating current-season WP48 with next-season wins will thus conflate two issues (relation between WP/WP48 and wins and between current season and the next).
Christopher
August 19, 2009
OK, here is a graphical summary of various predictions for the 08-09 season. This is based on an Erich Doerr post:
https://dberri.wordpress.com/2008/10/28/projecting-the-2008-09-nba-season/
The graphic is here:
http://img263.imagevenue.com/img.php?image=20135_nba_122_402lo.jpg
A few points: R is the average of B, C, D, E. The “best” prediction was Bill Simmons, followed by Vegas. The “worst” was BR WP. Note that what each code means is explained in the Doerr post.
What I’d like to see is pre-08-09 WP48 and actual 08-09 minutes played used to forecast 08-09. This is a “fair” test as it removes the ambiguity in estimating minutes, i.e., focuses more so on WP per se.
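Here is a rough sketch of what that test could look like in Python. It relies only on the definition of WP48 (Wins Produced per 48 minutes), so a player’s projected wins are his prior WP48 times his actual minutes, divided by 48. The rosters, win totals, and function names are hypothetical, not real 08-09 data.

```python
def projected_team_wins(roster):
    """Sum each player's prior WP48 scaled by the minutes he actually played.

    `roster` is a list of (prior_wp48, actual_minutes) tuples.
    """
    return sum(wp48 * minutes / 48.0 for wp48, minutes in roster)


def mean_absolute_error(projected, actual):
    """Average absolute miss, in wins, across teams."""
    return sum(abs(projected[t] - actual[t]) for t in projected) / len(projected)


# Hypothetical partial rosters: (WP48 before the season, minutes in the season).
rosters = {
    "Team A": [(0.200, 3000), (0.100, 2400), (0.050, 2000)],
    "Team B": [(0.150, 2800), (0.000, 2600)],
}
# Hypothetical wins actually produced by those same players.
actual_wins = {"Team A": 20.0, "Team B": 10.0}

projected = {team: projected_team_wins(r) for team, r in rosters.items()}
print(projected)
print(mean_absolute_error(projected, actual_wins))
```

Because actual minutes are plugged in, any remaining error comes from changes in per-minute productivity rather than from guessing playing time.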
dberri
August 19, 2009
Thanks Christopher,
One issue in looking at these projections is that both Pelton (who did the Basketball Prospectus projections) and Doerr only looked at past performance data. No effort was made to account for players coming off of injury. The “analysts” looked at both past data and also considered other factors that would alter player performance in the future. I think this explains why the Nuggets looked so bad in the data projections. Nene’s performance when he is hurt is quite a bit below what he does when he is healthy. An analyst can take this into account. If all you look at is numbers, you miss this point.
Christopher
August 19, 2009
I agree. That’s one reason why I’d like to see actual minutes in 08-09 used with pre-08-09 WP48. That would get at injuries somewhat. I think some of the minutes played projections tried to get at this too. Maybe some simple rule to accommodate that problem would be useful… Not sure what that would be off the top of my head.
Also, I’ll apologize for the randomly (but seldom) appearing junk before the figure. Not intentional, I just don’t know a web host that does not do this.
dberri
August 19, 2009
Ultimately I would like to see a projection that looked at all the factors that impacted the future. This includes past performance, age, productivity of teammates, and coaching (in some circumstances). Once you had this you could then think about injuries (which I am not sure how you would quantify). Just looking at past performance alone, though, has problems.
Christopher
August 19, 2009
Again, agree. I’m sure you have looked at performance as a function of games played or age. Are there some idealized “age curves” by position with decent statistical properties? I think that and some injury adjustment would be a big step forward. That’s the thing with expert judgment, all these subtleties are discounted intuitively.
dberri
August 19, 2009
There is a clear age profile. And that will be in the next book (which is scheduled for release next March). Working on editing right now.
Italian Stallion
August 19, 2009
Dberri,
“These numbers suggest that what Ariza does on the court is really about Ariza.”
I have to disagree with you.
To start with, for your view to be totally correct you would have to assume that players can be consistent like this even if teams were built almost randomly. As you know, GMs and coaching staffs look at the strengths and weaknesses of their existing players and try to add pieces that will complement and improve the team, fit into their own system, etc….
If each GM and coaching staff is doing just a reasonable job (which most do), then the new players can be just as effective. That’s why they were selected to begin with. It’s when they don’t that there can be an issue.
We see that sort of thing from time to time and it usually results in one or more quick trades, someone getting fired etc…
What some analysts are doing is digging deeper into the details of a player’s success to try to locate the EXCEPTIONS to the general rule that a player can remain consistent across teams and with different teammates.
I believe Ariza has been singled out as a possible exception to the rule because many suspect he will be asked to play more than a very minor role in Houston’s offense, may actually have someone guarding him for the first time in his career, and get a lot more minutes.
It remains to be seen if that analysis is correct, but to assume that since most players remain consistent across teams and with different teammates that all will, is almost certainly incorrect.
Man of Steele
August 19, 2009
Christopher,
I’m totally in agreeance with you. Dr. Berri’s protestations over injury-related issues are relevant, I think.
Dr. Berri,
An important distinction needs to be made between the accuracy of statistics and the consistency of statistics. Baseball statistics are far better at describing what happens on the field. Partially because players bat, pitch, and field by themselves, baseball statistics describe virtually everything that happens on a baseball field, unlike basketball statistics (including advanced metrics, from OPS and WHIP to Bill James’ work). Baseball players are inconsistent from season to season, though, so although baseball statistics accurately describe what happens in any given season, their predictive power for other seasons is low.
Basketball statistics, on the other hand, do not capture everything that happens on the court. For instance, there is no advanced metric like Bill James’ that captures defense. Much of passing is beyond the evaluation of “assists.” Basketball statistics, though, do have a great degree of predictive power. Basketball players tend to perform the same, per minute, from season to season.
In summary, baseball’s statistics are better, but basketball players are more consistent. So basketball decision-makers should be jealous of baseball statistics’ descriptive powers, and baseball GMs should be jealous of basketball players’ consistency.
simon
August 19, 2009
Man of Steele//
I’m still curious why you thought dberri was wrong about Kidd and the Mavericks when he was almost dead right about the outcome of the trade. dberri predicted they would win 55 wins after the trade, and the team’s overall efficiency differential indicated 54 wins. (http://www.basketball-reference.com/teams/DAL/2008.html) Yet you’ve written “Well, here is a case where WP and Win Score seem to be demonstrably wrong about a given player.” (https://dberri.wordpress.com/2008/09/24/jason-kidd-really-did-help-the-mavericks/)
Rumblebuckets
August 19, 2009
Ariza will be asked to do exactly the same things in Houston as he’s always done, and if he does operate with the ball in his hands, he’s in trouble. But he won’t. (One of his strengths is that he tended to only shoot bad shots at the end of a possession in the games I saw, though one of his weaknesses is that sometimes he generates a bad shot by passing up a good one. That was the major improvement he made in the playoffs that led to his better percentages. His offensive decision making was much better, both in terms of the speed at which it occurred and the fact that it was almost always right.) Landry and Scola are both capable of getting good looks for themselves, on just about anyone. (Evidence, Game 6 against the Lakers.) As is Brooks, especially against slow point guards (and he will face a lot in the West in Fisher, Kidd, Miller), and Lowry is just great and very underrated. If this team resolves to work possessions quickly (which historically Adelman has done), they should be able to get good shots. Where they will come into trouble is in situations where they need someone to bail them out at the end of the shot clock.
As for defensive stats, there are obvious starting points. (It’s not like offense is less team oriented than defense, ie. look no further than the study that showed that shooting percentages are much higher off a pass than not off a pass.)
Where to begin:
Just looking at individual players and which teams they tend to light up gives you a sense of who is a good individual defender. So if you look at individual possessions, who guarded who and what the offensive numbers looked like on these possessions vs. what they look like on average. I would assume that if you could do this for Garnett for his career, you would find that most players against Garnett were significantly less efficient. Or if you did it for Jason Williams, especially when he was on Sacramento, you would find that opposing players lit him up. Or Steve Nash against pretty much anyone. Or Jason Kidd since the microfracture surgery, especially against quicker point guards. And I would also assume you would find that for the majority of players you would find no real significant value either way.
Then you could also track fouls drawn on defense and fouls drawn on offense as well (this is a very obvious place where a single offensive/defensive player can affect a team outcome, both in terms of getting a team into the bonus, positively or negatively, and in terms of limiting the other team’s best players’ playing time due to fouls. One series in which this happened was Golden State v. Utah two years ago, with Black Orpheus, i.e. Baron Davis, basically forcing DWill to sit by drawing charges on him.)
Then if you want to go further, you could keep track of deflections, effective double teams, and ineffective double teams. (Do you factor defensive 3 second calls into this metric, basically minus a point?) How much is effective screen defense worth? (A quick test, do Marcus Camby’s teams tend to underperform efficiency expectations, based on the stats we do keep, rebounding, blocks, steals, fouls?) How much is an altered shot worth in terms of change in efficiency?
An interesting defensive study that should probably be done at some point as well. Is it better to give up an easy lay-up/dunk than foul? And does this relate to at what point in the quarter the event happens? If so, at what point in the quarter does it become better to foul than to allow the points?
I will say, as defense is a product mainly of effort and athleticism, players who show up well for their positions in rebounding, blocks and steals (though this sometimes occurs because of what would ultimately be considered bad d) are for the most part the best defenders. And if you do all those things well for your position and don’t contribute to negative possessions on offense, you are going to end up doing pretty well in this metric already. I.e., only a few players’ grades would change drastically in either direction.
Man of Steele
August 19, 2009
Simon,
Sorry, I didn’t know anyone was still looking at the “Kidd to the Mavs” article. Looking over the evidence, I may not have been exactly right, but I think my point might still be worth making. The Mavs were predicted by Dr. Berri and efficiency rating to win 54-55 games after they acquired Kidd, but they only won 51. While this difference is not huge, it is worth noting that the Mavs were 16-13 and got blasted in the first round of the playoffs after acquiring Kidd, while they were 35-18 beforehand. As nearly as I can tell from Dr. Berri’s calculations, the Mavs should have won about 54 games this past year (2008-09) but they only won 50. Again the disparity is not huge, but again the Mavs were a significant step back from what efficiency metrics (including Dr. Berri’s) said they should be with Kidd. The standard on this blog is that Win Score and WP correlate with actual team wins (I’m sorry; I am aware that “correlate” is not the correct mathematical term). In this case, though, they did not. The Mavericks performed worse after trading for Kidd than they did before trading for him, falling below expectations. This year, having Kidd the whole year, they once again fell below expectations.
So I think my point is somewhat valid. I’m not sure what that point means, or why it is. I used to think that it was related to point guards taking defensive rebounds out of the hands of big men on their teams, but I’m not entirely sure (although I believe Josh Howard’s rebounding numbers were significantly lower this year).
This reminds me of something in Bill James’ book. When writing about some pitcher (I can’t remember who) he points out that the pitcher was very productive for the Yankees but was awful for a bad team, I think the Royals, because he allowed many balls to be hit. Teams with solid defenses, like the Yankees, were able to field a significant portion of these balls with no problem, and so the pitcher was very effective for the Yankees. A team with a poorer defense, though, had more difficulty fielding some of the batted balls, and so the pitcher was not helpful for them. Perhaps Kidd is similar – he is extremely valuable for a team like the Nets with no productive big men, but he may not be quite as valuable to a team like the Mavs who do have productive big men to rebound. Just a thought.
Todd
August 19, 2009
I think people who study stats marvel at baseball, not because player performance is consistent from year to year, as it is in basketball, but because it is much easier to use statistics to determine who the best player was in any given season. (or at least who the best hitter is) In baseball, you can compare the seasons of two sluggers, and using stats like on-base, and slugging, determine which player did more to help his team win. In basketball, as the pages and pages of posts on this site indicate, there is a great deal of disagreement about what contributes to wins. WOW uses regression analysis, but ten different scholars could come up with ten different formulas that are just as predictive but attribute wins quite differently, and ultimately, any attempt to regress box score data must assume that the rebounder is the primary defender in every defensive possession that results in a miss. Usage is another issue. Baseball players are easy to compare against each other because we judge them per at bat. In basketball, different players play different roles, and there has to be some sort of subjectivity in determining how much each role is worth.
simon
August 19, 2009
Man of Steele//
What you’re questioning is not really about the WP, which is more about individual contributions, but the connection between team efficiency differential and actual team record, and that link has been pretty well accepted by almost everyone including Hollinger, Kukatko, Oliver, etc. There’s random variation from year-to-year, but the consensus is that the team efficiency is a better indicator than the W-L record. Also remember that Kidd’s NJ teams in 2004~2006 beat the statistical projection, but his team from 2001~2003 failed to match the expected wins, so it’s hard to find much evidence for what you suggested. Most of it seems pretty random variation. Remember that the good old Pythagorean formula works better with basketball than baseball.
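For readers who have not seen it, the Pythagorean formula mentioned above turns points scored and points allowed into an expected winning percentage. A minimal Python sketch follows. The exponents are assumptions for illustration: 2 is the classic Bill James value for baseball runs, and a value around 14 is often quoted for the NBA, but published estimates differ.

```python
def pythagorean_win_pct(points_for, points_against, exponent):
    """Expected winning percentage from scoring totals (or per-game averages)."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


# Assumed exponents, not values taken from this post.
BASEBALL_EXPONENT = 2
NBA_EXPONENT = 14

# Hypothetical team scoring 100 and allowing 97 points per game over 82 games.
expected_pct = pythagorean_win_pct(100, 97, NBA_EXPONENT)
print(round(82 * expected_pct, 1))
```

Comparing this expected total with the actual W-L record is one way to see how much of a team’s record is the random variation described above.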
It was a coincidence that yesterday I decided to go back and read some of the old postings and you posted today, so I thought I’d ask. Also I forgot how epic the T-Ball vs. Friedman battle was. https://dberri.wordpress.com/2008/12/16/really-the-answer-is-iverson/ Now that’s persistence
Italian Stallion
August 20, 2009
rumblebuckets,
IMO, if Ariza is just asked to do the same things for Houston that he did for the Lakers, then he will probably generate “similar” stats.
I think he might suffer a small bit because even though he’s not going to be the focal point of the opposing defense, he’s also not going to shoot as many wide open shots on Houston.
On the flip side, he’s a young and improving player. Therefore, his stats may reflect that improvement as offsetting the disadvantage of not getting as many wide open looks. (but who knows)
That’s the funny thing about complex stats.
Sometimes you get the right answer for the wrong reason, but to know that requires a high level of expertise and excellent subjective detailed analysis.
Arturo
August 20, 2009
Christopher,
Can you post a link to the original article on the model evaluation?
dq
August 20, 2009
I started looking at your stats and ordered the book. I found this in the meantime
http://www.wagesofwins.com/CalculatingWinsProduced.html
with
Lanier PROD = 622*0.032 + 298*0.017 + 537*-0.032 + 88*-0.015 + 197*0.032 + 518*0.033 + 225*-0.032 + 82*0.033 + 155.3*-0.017 + 93*0.019 + 216*0.022 = 28.57
My problem is that it doesn’t equal 28.57 but instead 29.25 – I think there is a difference in rounding versus the number displayed, and wonder if you could tell me whether that is true.
I also had this, which equals .1130, not .1145
Per 48 minute value of blocked shots and assists = [(330*0.019+ 1840*0.022) / 19,855] * 48 = 0.1145.
I don’t know if I’m getting lost in the weeds, or whether the 2% variance matters. I’m just trying to understand all that you are doing, and it helps me to try and do the math.
Thanks
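For anyone who wants to redo the arithmetic quoted above, here is a short Python check. It uses only the counts and displayed (rounded) coefficients copied from the comment; the variable names are mine. With those displayed values the sums come to 29.25 and about 0.1130, matching what the comment reports rather than the published 28.57 and 0.1145.

```python
# (count, displayed coefficient) pairs copied from the quoted Lanier PROD line.
terms = [
    (622, 0.032), (298, 0.017), (537, -0.032), (88, -0.015),
    (197, 0.032), (518, 0.033), (225, -0.032), (82, 0.033),
    (155.3, -0.017), (93, 0.019), (216, 0.022),
]
lanier_prod = sum(count * weight for count, weight in terms)

# Per-48-minute value of blocked shots and assists, from the second quoted line.
per48 = (330 * 0.019 + 1840 * 0.022) / 19855 * 48

print(round(lanier_prod, 2), round(per48, 4))
```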
dberri
August 20, 2009
dq,
For Lanier’s PROD it is just rounding.
Christopher
August 20, 2009
Arturo, not sure what you mean, can you clarify?
Man of Steele
August 20, 2009
Simon,
You may be right. I certainly didn’t intend to attempt to disprove efficiency differential. I just have a suspicion that a system which values offensive and defensive rebounds equally may occasionally overvalue a player who takes large numbers of defensive rebounds out of other players’ hands. Jason Kidd is a great passer, a great rebounder, and a poor shooter. On a team like the Nets that has no productive big men, that is fantastic. On the Mavericks, though, he is a great passer and a poor shooter, and he is taking a few rebounds out of Josh Howard’s hands (and perhaps Nowitzki, Dampier, or Diop’s).
I would like to qualify all my comments by stating that I really enjoy and appreciate Dr. Berri’s work. I think it is probably the best player evaluation system out there currently. The only reason I waste time commenting on this blog is because I think it is the best, and I think it could be even better with a little tweaking.
Arturo
August 22, 2009
Christopher
I was just wondering if there was an article or blog post to go with this graph:
“The graphic is here:
http://img263.imagevenue.com/img.php?image=20135_nba_122_402lo.jpg”