Very early on Monday morning Henry Abbott – of TrueHoop – posted the following:
June 15, 2009 1:29 AM
Champion of the 2009 TrueHoop Stat Geek Smackdown.
Berri, the lead author of the Wages of Wins, beat ESPN’s John Hollinger by seven points. Hollinger picked every series of the playoffs right, except for three — Houston vs. Portland in the first round, and both Conference Finals. Berri, meanwhile, correctly predicted that the Lakers would beat the Nuggets.
Soon afterwards my phone just started ringing. Members of the media from around the world kept calling, wondering about the secret to my victory. And here is the story I told…
It all began when the Lakers acquired Pau Gasol. At the time Andrew Bynum was still hurt. But I argued that once Bynum was healthy, the Lakers, with Kobe Bryant, Gasol, and Bynum (as well as Trevor Ariza and Lamar Odom), would be the favorites to win the NBA title. That was my argument in the midst of the 2008 NBA Finals (when Bynum was still hurt). And that was what I argued at the outset of the 2008-09 season.
Now that the Lakers have indeed won the 2009 NBA Championship, we can take a step back and evaluate my immense predictive powers. Not only did I correctly identify the winner in thirteen of the fifteen 2009 playoff series, but I also correctly identified the NBA champion more than twelve months before it happened. Such a result clearly indicates that what is said about basketball in The Wages of Wins is correct.
Then again….
Let’s take a step back and identify two problems with the previous two paragraphs.
1. Okay, no one actually called me (not even Henry).
2. Although I did pick the Lakers to win more than 12 months ago, one crucial detail was incorrect. The key to the Lakers was going to be the return of Andrew Bynum. Bynum the person did return to the court, but the Bynum we saw in 2007-08 did not. In 2007-08, Bynum posted a WP48 [Wins Produced per 48 minutes] of 0.394 (average is 0.100). This past season his WP48 was 0.198. Such a mark is quite good, but not nearly as good as what we saw in 2007-08. Plus, Bynum was again hurt. In 2007-08 he played only 35 games before a season-ending injury. This past season he managed to appear in only 50 games.
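For anyone new to the metric, WP48 is simply a player's Wins Produced prorated to 48 minutes of play. A minimal sketch of the arithmetic follows; the wins-produced total is a back-calculation from the 0.394 mark and the 1,008 minutes noted below, not an official figure.

```python
# Minimal sketch of the WP48 arithmetic: Wins Produced prorated to 48 minutes.
# The 1,008-minute total is the figure cited in this post; the wins-produced
# value is back-calculated from the reported 0.394 WP48 and is only an estimate.

def wp48(wins_produced, minutes):
    """Wins Produced per 48 minutes of playing time."""
    return wins_produced / minutes * 48.0

bynum_2007_08 = wp48(wins_produced=8.27, minutes=1008)
print(round(bynum_2007_08, 3))  # -> 0.394, the mark cited above

# An average player posts about 0.100, so 0.394 is elite territory, while the
# 0.198 Bynum posted in 2008-09 is still roughly twice the average player.
```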
When we look at Bynum we see a clear decline in performance. This could be due to
- Bynum’s injury problems.
- diminishing returns (the team did add Gasol), although this effect tends not to be so large.
- the possibility that what Bynum did in 1,008 minutes in 2007-08 was not representative of what he will do over time in the NBA.
All three explanations might have some validity (although, once again, the diminishing returns story probably cannot explain the size of the decline we observe). Regardless, I assumed Bynum would return to what we saw in 2007-08 and therefore concluded the Lakers were the clear favorite to win the title in 2009. The Lakers did win, but I am not sure that would have happened had Orlando not upset Cleveland. In fact, I was quite happy to see that upset. Picking LA to beat Orlando seemed much easier than trying to guess whether Cleveland or LA would be victorious in the NBA's Kobe-LeBron dream finals.
Learning from the Smackdown
So what have we learned from the entire TrueHoop Smackdown experience?
Picking playoff series is really not a very good test of a model or someone's analytical skills. As I have been saying all along (see HERE and HERE and HERE), a seven-game series is too short for predictions to be made with perfect accuracy. In other words, the best model can be done in by the randomness of a small sample.
That being said, I think we have some evidence for the elements of a “best” model (at least “best” under the circumstances of the playoffs). Two years ago I lost to Justin Kubatko in this contest. As I noted at the time, both of us made our picks according to a team’s efficiency differential. Kubatko, though, considered home court advantage and I did not. Because both efficiency differential and home court advantage matter (and this can be seen statistically), Kubatko had the better model and he probably deserved to win.
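For concreteness, here is a rough sketch of what a two-factor picker along these lines might look like. The three-point home-court bonus and the exact comparison are illustrative assumptions, not the precise adjustment used for the actual Smackdown picks.

```python
# Rough sketch of a series picker built on two factors: regular-season
# efficiency differential and home-court advantage. The 3.0-point home bonus
# is an illustrative assumption, not the exact adjustment used for the picks.

def pick_series(home_team, away_team, home_edge=3.0):
    """Each team is a (name, efficiency differential) pair; the team with
    home-court advantage gets a small bonus before the comparison."""
    home_name, home_diff = home_team
    away_name, away_diff = away_team
    return home_name if home_diff + home_edge >= away_diff else away_name

# Example with rough, illustrative differentials for the 2009 Western finalists
print(pick_series(("Lakers", 7.6), ("Nuggets", 3.6)))  # -> Lakers
```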
Last year I did not participate and Kubatko repeated as champion. This year I was back and Kubatko was absent. But with a model based on efficiency differential and home court advantage I was victorious. So we have now seen three consecutive seasons in which the person who considered only these two factors managed to win the contest. Once again, the playoffs are not strictly pre-determined by these two factors. Upsets can and will happen. And when this happens – as I have argued – one must try to resist the urge to cling to ad hoc explanations. In other words, sometimes s**t happens, and when that happens that should be your explanation.
Of course, you can’t say s**t on TV. This must be why people on TV feel the need to offer ad hoc explanations (then again, maybe not).
– DJ
The WoW Journal Comments Policy
Our research on the NBA was summarized HERE.
The Technical Notes at wagesofwins.com provide substantially more information on the published research behind Wins Produced and Win Score.
Wins Produced, Win Score, and PAWSmin are also discussed in the following posts:
Simple Models of Player Performance
What Wins Produced Says and What It Does Not Say
Introducing PAWSmin — and a Defense of Box Score Statistics
Finally, A Guide to Evaluating Models contains useful hints on how to interpret and evaluate statistical models.
Nick
June 17, 2009
Well-written acceptance.
This is a very good explanation of your victory. In reality, luck always plays a big role in these sorts of things. Good to see that you acknowledge that, while your model is sound, it still depends on a great many outside factors that no one can truly predict, unlike others who claim to be able to predict all of the unpredictables.
P-Dawg
June 17, 2009
Few questions:
1) Given Bynum’s performance, would the Lakers have been the favorites to beat the Celtics, if Kevin Garnett and Leon Powe had been healthy?
2) It would be interesting to see PAWS48 for the playoffs, and how the numbers differed from the regular season.
3) DYING to see a PAWS48 ranking of the 2009 and 2010 free agent classes. There’s a lot of talk about 2010 being the best free agent class ever. True? And where should a smart GM put their dollars?
Ray
June 17, 2009
It’s not a real win without Kubatko! It’s like Houston winning their championships while Jordan was retired… just kidding, sort of
k
June 17, 2009
An ad hoc explanation is better than no explanation at all (e.g. “s**t happens”)
Evan
June 17, 2009
McHale gone from the Twolves. They should know that I’m available, and I can guarantee to deliver way more wins than that guy.
Evan
June 17, 2009
Prof —
So 3 times in a row, the guy with the best model won. Hmm. Could just be positive variance, or…
Is it possible that better teams (by point differential) win more often than the data would suggest? That is, that the underdogs win fewer series than would be expected.
[A few factors seem possible here: tactical coaching might matter more in the playoffs, some teams (e.g., the Spurs a few years ago) play their best players less in the regular season than other teams, or what I'll call the “Shaq really wasn’t trying as hard in the regular season” factor.]
I am tempted to crunch some numbers.
brgulker
June 17, 2009
So, then congrats are in order for picking the best model, not the winner?
Italian Stallion
June 17, 2009
I would like to see how any of these models does against the Las Vegas odds line going into each series. LOL
Ray
June 17, 2009
Hey! There can’t be 2 Rays on here. Who’s letting people go by the same screen name on here!
brgulker
June 18, 2009
You should switch to Ray the First :P
romalley
June 18, 2009
Or Sugar Ray
UB
June 18, 2009
Just a note –
In actuality, Kurt from Forum Blue and Gold won the overall title, as Hardwood Paroxysm combined their bloggers with Henry’s True Hoop smackdown – Kurt had 72 points going into the Finals (http://www.hardwoodparoxysm.com/2009/06/02/espn-truehoop-blogger-smackdown-update), and obviously picked LA to win, giving him a minimum score of 77 (theoretically it could be 79, but I don’t think he predicted LA in 5).
So a blogger with no statistical model (of which I’m aware) actually beat the whole lot – and really, many of the bloggers listed in the HP group outperformed the statheads.
Interesting.
Nick
June 18, 2009
UB. If everyone was allowed to participate in the competition, even if they had no knowledge at all of the NBA, someone would be bound to beat those in the know, just by throwing sheer numbers at the competition.
You can’t automatically believe that the winner knows best. It’s not definitive proof.
This is an argument that I think will never win, though. People base theories on results. And if actual results don’t match the predictions, then the system is deemed a failure. In reality, the key here is identifying the most reliable system, the one that in the long term does the best job of analysis. Inconsistencies are inevitable. There is so much luck involved in ANYTHING of this sort that it will never be 100% possible to predict anything.
In the end, what makes sports so appealing to most is that it’s like a TV show in which even the actors and directors themselves have no idea what will happen.
UB
June 19, 2009
Nick – it’s true that unlimited participation would produce outliers. However, a cross-section of the top NBA bloggers (the participants allowed by Hardwood Paroxysm) is no more a ‘random sample’ than a cross-section of ‘stat geeks.’ The rules were the same as well.
I agree that one can’t believe the winner automatically knows best. But the inference above was made that “3 times in a row, the guy with the best model won.” I’m pointing out that among a group of knowledgeable fans, that model was less successful than ‘non-stat’ analysis. And in fact, most of the bloggers outperformed their ‘stat-geek’ counterparts.
Mike
June 19, 2009
Just wanna piggy back on UB’s comment… I was going to bring up that Kurt from Forum Blue & Gold won the overall competition. I really think there can be a fair discussion about combining simple statistical analysis with traditional scouting. They are not mutually exclusive.
Brian Tung
June 25, 2009
@UB: The fact that top NBA bloggers don’t constitute a random sample only strengthens Nick’s and Berri’s point, I think. Since the bloggers would outperform J. Random SportsFan, it becomes harder to conclude anything from the fact that the winner was a blogger.
Because I’m pretty sure there were more blogger participants than stat geek participants. So even if there were no actual difference in their prognosticating ability, the overall winner would be more likely to be a blogger (in exact proportion with their numbers). So not only can you not conclude that the winner knows best, you can’t even really make a solid conclusion about the two groups based solely on the performance at the top. It would be more interesting to see the overall matchup, and how the two groups compared up and down the ranks. If the whole group of bloggers managed to outperform the stat geeks, overall, that would indeed say quite a bit. But the comparison at the top just has the weakness of anecdotal evidence.
Once upon a lifetime, I shared an office with Chris “Jesus” Ferguson when we were both Ph.D. students and he was about to make oodles of cash playing cards. He said that there might typically be 200 players at a tournament, meaning that the average player would have a 0.5 percent chance of winning. Being a top player raised that to maybe 3 percent or so. That means that the tournament winner is still pretty likely (2-to-1 or so) not to be a top-10 player. It’s going to look like an upset. But statistically, it’s not, because there are a heck of a lot more underdogs than there are favorites.
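For anyone who wants to check that arithmetic, here is a quick back-of-the-envelope version using the figures quoted above (the ten-player cutoff for the “top” group is an assumption):

```python
# Back-of-the-envelope check of the tournament arithmetic quoted above.
field_size = 200                     # players in a typical tournament
top_players = 10                     # assumed size of the "top player" group
p_top = 0.03                         # assumed win probability per top player

p_average = 1.0 / field_size             # 0.5 percent, the "average player" figure
p_top10_winner = top_players * p_top     # about 0.30
p_other_winner = 1.0 - p_top10_winner    # about 0.70

# Roughly 2.3-to-1 against the winner being a top-10 player
print(p_average, round(p_other_winner / p_top10_winner, 2))
```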
Bottom line: Does anyone have the entire results, with both bloggers and stat geeks? That would answer all of these questions and we wouldn’t have to speculate.
@Mike: The appeal of the stat geek smackdown is that the models are set in place before the playoffs, I believe. One can feed them with new data after each round, but I don’t think the models were changed once the competition began. With the bloggers, there aren’t really any well-defined models to begin with, so the competition, though conceivably more interesting to watch personally, isn’t as enlightening.
It’s a bit like comparing predictions made by astronomers with those made by astrologers, in an imaginary world where astrologers were actually kind of successful in their predictions. It might be interesting if the astrologers were to outperform the astronomers, but not as informative or useful, because the astronomers use a public model, whereas astrologers are black boxes. Once the astrologer dies, their skill or talent dies with them, but the astronomical model can be worked on and refined by others.
That’s not really meant to compare the sports bloggers to astrologers, but to point out that although personal glory is nice, the real benefit of the stat geek smackdown is that it encourages the stat geeks to improve their models and our understanding of what goes into a championship-winning team.
@k: I hope you’re joking/trolling. Since ANY result can always be explained by SOME ad hoc explanation, the existence of an ad hoc explanation suiting the actual result is very unconvincing. It would have to have been more compelling before the fact than the null hypothesis, which is, essentially, stuff happens (and by stuff, I really mean something else). You can’t use a result to justify an explanation that was only suggested by that result. (Well, you can, but you’ll only convince yourself, a very dangerous state of affairs for any person aiming for rationality.)