Last summer Kevin Durant made his professional debut in the NBA summer leagues. His performance inspired the following two posts:
These two columns made two observations:
a. Kevin Durant played very badly last summer.
b. The media covering summer league basketball didn’t seem to notice that Kevin Durant played very badly last summer.
Durant followed this summer league performance by playing poorly in his rookie season (and again the media didn’t seem to notice).
The Durant story might lead us to believe that something can be learned from summer league basketball (other than the media’s inability to look past scoring totals). But is that really possible? Does a player’s performance in summer league really tell us about a player’s future performance in real NBA games?
The 2008 Las Vegas Summer League Numbers
Before I answer this question I thought I would examine the performances in the 2008 Las Vegas summer league. A few days ago Erich Doerr graciously sent me the data from Vegas. And with data in hand, we can now offer the first evaluation (if we ignore the Orlando summer league) of the 2008 NBA draft class.
Table One: The 2008 Las Vegas Summer League
Table One presents an evaluation of each player who was
a. chosen in the 2008 draft, and
b. played at least 50 minutes in the 2008 Las Vegas Summer League
Each player is evaluated in terms of PAWS48, or Position Adjusted Win Score per 48 minutes. In essence, each player’s Vegas numbers were evaluated in terms of Win Score, and this number was compared to what an average NBA player – at the player’s position – would have done in 48 minutes. Positive numbers indicate the player was above average (and negative numbers mean… well, I think you can figure that out).
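To make the position adjustment concrete, here is a minimal Python sketch. It assumes you already have a player’s Win Score per 48 minutes and the average Win Score per 48 minutes at his position. The function name `paws48` and its parameter names are illustrative labels, not part of any published code. The sample figures (9.3 and 6.4) are the ones quoted for Jerryd Bayless in the comments at the end of this post.

```python
def paws48(win_score_per48: float, position_avg_per48: float) -> float:
    """Position Adjusted Win Score per 48 minutes.

    Hypothetical helper: subtracts the average Win Score per 48 minutes
    at the player's position from the player's own Win Score per 48.
    A positive result means above average, a negative one below average.
    """
    return win_score_per48 - position_avg_per48


# Figures quoted for Jerryd Bayless in the comments: 9.3 minus 6.4.
bayless = paws48(9.3, 6.4)
print(f"Bayless PAWS48: {bayless:.1f}")
```

A player who exactly matches the average at his position would come out at zero.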
Of the 32 players who satisfied both of the above conditions, only eight (or 25%) were above average. In other words, most rookies played badly – by NBA standards – in Vegas.
Of the eight “good” players, only three – Kevin Love, Jerryd Bayless, and D.J. Augustin – were taken in the first round. The remaining five above-average performers were second-round choices, with Maarty Leunen and James Gist taken in the last few picks of the draft.
When we look at the “bad,” or below-average, players, we find a few more lottery picks. O.J. Mayo, Joe Alexander, and Anthony Randolph were each below-average performers.
Explaining the Numbers
So what does this mean? Do these numbers indicate that Maarty Leunen is going to be a better pro than Joe Alexander? Will James Gist offer more than O.J. Mayo?
To answer these questions, I thought I would look at the relationship between what the 2007 draft class did in Las Vegas and in their 2007-08 rookie season. Specifically, I collected data on 24 players who were
a. chosen in the 2007 draft
b. played at least 50 minutes in the 2007 Las Vegas summer league
c. and played at least 100 minutes in the 2007-08 regular season.
I then regressed a player’s PAWS48 from his 2007-08 rookie campaign upon his PAWS48 in the 2007 Las Vegas summer league.
The regression resulted in the following equation:
PAWS48 in the NBA = -0.903 + 0.139*PAWS48 in Las Vegas
O.J. Mayo posted a -6.2 PAWS48 in Vegas. Given this model, Mayo’s expected PAWS48 in the NBA is -0.903 + 0.139*(-6.2) = -1.8. With an expected mark below zero, we now have “conclusive proof” that Mayo will not be a “good” NBA player.
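The arithmetic behind that forecast can be sketched in a few lines of Python. The intercept and slope are the ones reported in the equation above; the function name `expected_nba_paws48` is an illustrative label. Keep in mind that this gives only the point estimate and says nothing about how uncertain the slope is.

```python
# Coefficients as reported for the 2007 draft class regression.
INTERCEPT = -0.903
SLOPE = 0.139


def expected_nba_paws48(vegas_paws48: float) -> float:
    """Point estimate from the fitted line (hypothetical helper name)."""
    return INTERCEPT + SLOPE * vegas_paws48


# O.J. Mayo's Vegas PAWS48 was -6.2.
mayo = expected_nba_paws48(-6.2)
print(f"Mayo expected NBA PAWS48: {mayo:.1f}")
```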
Really Explaining the Numbers
Now some people might stop reading after the last sentence and thus believe that I have “proven” Mayo is not going to be a good NBA player. The reality is quite different (there is a reason “conclusive proof” is in quotation marks).
To understand what this model is actually saying we need to spend a bit more time thinking about the estimated relationship between Vegas PAWS48 and NBA PAWS48. The equation indicates that each one unit increase in Vegas PAWS48 increases the NBA PAWS48 by 0.139. This, though, is just an estimate. And with this estimate comes a standard error.
Before I reveal this standard error, let me review what Ian Ayres calls the “Two Standard Deviation” rule. In his book “Super Crunchers” (a book I really enjoyed and need to talk about more), Ayres states the following:
“There is a 95% chance that a normally distributed variable will fall within two standard deviations (plus or minus) of its mean.”
What does this mean for our regression? The value 0.139 is just an estimate. There is a 95% chance that the “true” value of this coefficient will fall within two standard errors (again, plus or minus) of this value. The standard error for this coefficient is 0.133, which means there is a 95% chance that the “truth” lies between -0.136 and 0.415 (again, you go two standard errors in each direction to get the 95% confidence interval).
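Here is a small Python sketch of that interval calculation, using the coefficient and standard error quoted above. A literal two-standard-error band comes out slightly narrower than the endpoints quoted, which is what you would expect if the interval was built with the t critical value for 24 - 2 = 22 degrees of freedom (about 2.074) rather than exactly 2. That reading is an assumption, not something reported above, and small remaining differences in the last digit are presumably rounding in the reported inputs.

```python
COEF = 0.139         # estimated slope on Vegas PAWS48
STD_ERR = 0.133      # standard error of that slope
T_CRIT_22DF = 2.074  # two-sided 95% t critical value, 22 df (assumed)


def interval(coef: float, se: float, multiplier: float) -> tuple[float, float]:
    """Return (lower, upper) as coef plus or minus multiplier * se."""
    half_width = multiplier * se
    return coef - half_width, coef + half_width


rule_of_thumb = interval(COEF, STD_ERR, 2.0)
with_t_value = interval(COEF, STD_ERR, T_CRIT_22DF)

print("two standard errors:  %.3f to %.3f" % rule_of_thumb)
print("t critical, 22 df:    %.3f to %.3f" % with_t_value)
```

Either way, the interval runs from a negative number to a positive one, so it contains zero.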
Given our confidence interval it could be that
a. the better a player does in Vegas, the better he does in the NBA (or the relationship is positive), or
b. it could be that the coefficient is negative, so the better a player does in Vegas, the worse he does in the NBA, or
c. it could be that the “true” value is zero. This means there is no relationship between Vegas performance and NBA productivity.
In sum, we now know that the relationship between the Vegas numbers and the NBA numbers is positive, or negative, or non-existent. Or in other words, we haven’t learned much of anything.
Actually, let me amend that statement. We did learn something. When we get such results we conclude that the estimated relationship is “not statistically significant.” And once we see this, our discussion of the estimated relationship stops. We would not proceed to forecast Mayo’s NBA performance (as I did earlier). We were not able to find a relationship, and hence our “Mayo story” has to end.
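The same conclusion can be reached with a t-statistic, sketched below in Python. The coefficient and standard error are the reported ones. The critical value of about 2.074 assumes a two-sided 5% test with 22 degrees of freedom (24 players minus two estimated parameters), which is an assumption about the setup rather than something stated here.

```python
COEF = 0.139         # estimated slope on Vegas PAWS48
STD_ERR = 0.133      # standard error of that slope
T_CRIT_22DF = 2.074  # two-sided 5% critical value, 22 df (assumed)

# Null hypothesis: the true slope is zero.
t_stat = COEF / STD_ERR
significant = abs(t_stat) > T_CRIT_22DF

print(f"t = {t_stat:.2f}, significant at the 5% level: {significant}")
```

With a t-statistic only a little above one, the null hypothesis of no relationship cannot be rejected, and that is where the story ends.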
Really, Really, Explaining the Numbers
Although we cannot use this model to evaluate NBA players, there is more to the story. For example, did we actually “prove” that what happens in Vegas stays in Vegas? No, it turns out our model doesn’t even let us reach this conclusion. The simple model has some issues.
First of all, our sample was quite small. We only had 24 observations. Perhaps if we had data from more seasons we could find a relationship.
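For a very rough sense of how much more data “more seasons” might mean, here is a back-of-the-envelope Python sketch. It leans on two strong assumptions that the analysis above does not support: that the point estimate would stay at 0.139 with more data, and that the standard error shrinks in proportion to one over the square root of the sample size. The function name is hypothetical, and the output is an illustration of the logic, not a forecast.

```python
def approx_n_for_significance(coef: float, se: float, n: int,
                              crit: float = 2.0) -> float:
    """Rough sample size at which |coef| / se would reach `crit`.

    Assumes (strongly) that the point estimate stays put and that the
    standard error scales with 1 / sqrt(n).
    """
    return n * (crit * se / abs(coef)) ** 2


needed = approx_n_for_significance(coef=0.139, se=0.133, n=24)
print(f"roughly {needed:.0f} players under these assumptions")
```

If the true slope is actually zero, no amount of additional data would produce a meaningful relationship, so this is best read as a statement about how noisy 24 observations are.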
In addition, and this is perhaps more important, our model may not have been specified correctly. Our model of NBA PAWS48 only considers one explanatory variable [Vegas PAWS48]. Perhaps if we considered other explanatory variables the estimated relationship between Vegas PAWS48 and NBA PAWS48 would be different. Specification of a model is extremely important, and if your model is mis-specified your estimated coefficients – and the statistical significance of these coefficients – can be affected. And this will make the interpretation of your results difficult.
Okay, obviously when you run a regression – and interpret your findings – there are quite a few issues to consider. In fact, there is quite a bit to learn before you even start playing around with this stuff (JC Bradbury wrote a brief post a couple of years ago at Sabernomics detailing what you might want to start reading).
Did We Learn Anything?
Alright, let’s review what we learned from the analysis of the 2008 Las Vegas summer league. I think we learned
a. O.J. Mayo did play poorly in Vegas.
b. other players – who are not as well known – played better.
c. these results, though, do not indicate that O.J. Mayo will have problems in the NBA.
Let me close by noting that one can offer better analysis of the relationship between college and professional performance. And unlike what we see with our Vegas analysis, there is a statistically significant relationship between what we see in college and what we will see in the NBA.
When we look at O.J. Mayo’s college numbers we do see evidence that he might struggle as a pro. Of course, the key word is “might.” Regression analysis is not a crystal ball. It simply reveals tendencies that decision-makers should consider. So despite what Mayo did at USC, it’s possible he will be a productive NBA player. It’s just somewhat more likely that he won’t.
– DJ
Our research on the NBA was summarized HERE.
The Technical Notes at wagesofwins.com provide substantially more information on the published research behind Wins Produced and Win Score.
Wins Produced, Win Score, and PAWSmin are also discussed in the following posts:
Simple Models of Player Performance
What Wins Produced Says and What It Does Not Say
Introducing PAWSmin — and a Defense of Box Score Statistics
Finally, A Guide to Evaluating Models contains useful hints on how to interpret and evaluate statistical models.
Kent
July 25, 2008
This is way too wishy-washy. Take a stand. Dude, will Mayo be good or won’t he be?
Will
July 25, 2008
Kent,
Dave and Erich Doerr posted a series of columns in late-May/early-June evaluating the 2008 NBA Draft. Search for the term “Mayo” to find them quickly. Based on OJ Mayo’s college performance, it is unlikely that he’ll ever live up to his draft position.
MattB
July 25, 2008
While Kent is a bit blunt, I agree. This was a bit too wishy washy for my liking.
I understand why, just not why it’s worth highlighting.
MarkT
July 25, 2008
I have a couple of points. The obvious one is that the data set is too small.
Also, you have no idea what a player was asked to focus on in each game. It may be he was asked to work on things for developmental purposes that he will not be asked to do in a real game being played to win. Finally, with guys this young, I don’t think you can expect to project anything beyond his first few months of performance, and not his entire career. I think Mayo will become a very, very good player.
GrizzGM (of Grizzlies Messgageboard fame)
July 25, 2008
You really like picking on the Grizz, don’t you?
Evan
July 25, 2008
MarkT – want to bet on that?
Pete
July 25, 2008
This blog is obsessed with Paws48 to evaluate players. What does plus/minus say? What does Hollinger ratings say?
David
July 25, 2008
What is with the tiny emoticon on the upper right of the page?
Nate
July 26, 2008
Beyond the limited statistical power of summer league evaluations, it’s not really fair to judge players on how they perform before they’ve even gone through a whole pre-season.
One thing I’d like to see more discussion of here is player progression in general. I remember one post about average WP48 for players in different years, but it seems like it could be done in a lot more detail. For example: Do individual players mostly have the familiar growth-plateau-decline production curve, or do players have up and down years in a statistically significant way? What factors augur well or poorly for future production?
Ap
July 27, 2008
Great post Berri… statheads love this kind of process because it’s what we all go through (in a very simplified form). It’s funny, on the other hand sports fans want opinion… basis or no, all they want is opinion.
Thus we have the albatross that is ESPN.
Jordan
July 27, 2008
I thought Phoenix played in the Vegas Summer League. Where is Robin Lopez on this list? The stats from NBA.com say that he played a little over 82 minutes. His per48 numbers were around 28 pts, 13.5 reb, and 3 blks. I’d be interested to see how these numbers work out using WS48 and PAWS48. Can anyone provide the numbers on Lopez? Thanks.
Oren
July 28, 2008
“The standard error for this coefficient is 0.133, which means there is a 95% chance that the “truth” lies between -0.136 and 0.415 (again, you go two standard errors in each direction to get the 95% confidence interval).”
Doesn’t that mean that there’s a 95% chance that his performance indicates that he is somewhere between being one of the worst NBA players ever to being about as good as Michael Jordan in his prime?
dberri
July 28, 2008
Oren,
No, it means we cannot find a statistical relationship between Vegas production and NBA production.
I am going to write another post on this topic but briefly… when you estimate a regression the null hypothesis for a coefficient is that it is zero. So you are trying to reject the null that there is no relationship between the variables. When your confidence interval is this large, you cannot reject the null. Hence that is the end of your story. You do not proceed to talk about what the coefficient means for that particular player (as I did).
Tom Mandel
August 2, 2008
Your table contains a couple of errors — Jerryd Bayless should be adjusted to 3.9 rather than 2.9, and Marreese Speights played PF not C in the games I watched.
I’ll be interested to see whether James Gist earns a spot on a team based on his good play.
An interesting comparison would be between these kids’ college numbers and their Vegas numbers. E.g. Gist was not this good in college, and Speights was *very* good. Your thoughts on this, Dave?
dberri
August 2, 2008
Hi Tom,
I think Bayless is correct. 9.3 minus 6.4 is 2.9. As for Speights… I used the position listed at ESPN’s draft page. So I wasn’t commenting on where they played in Vegas.
I think the bigger point, though, is that the Vegas numbers really don’t mean anything. Fans of Gist should not be encouraged by what happened in Vegas. And fans of Mayo should not be discouraged by Vegas. What Mayo did in college, though, should be a red flag. College numbers are related to future performance in the NBA.