This column is over 2,000 words. So let me apologize up front for wasting your time today with way too many words. If you do manage to get to the end of this post, you will see the following:
– a list of the NBA’s Best Teams in 2006-07
– a discussion of my efforts to predict the playoffs for True Hoop
– a review – built upon the work of Stephen Dubner – of The Black Swan by Nassim Nicholas Taleb.
– a possible explanation for why Chicago failed to defeat Detroit in the playoffs
– a link to my latest column for The New York Times
– and a quick poll
Again sorry for the length, but I got a bit carried away last night.
The NBA’s Best and Predicting the Playoffs
Continuing yesterday’s Sinatra-inspired theme, I keep looking at “some nice things I’ve missed” by turning my attention to the NBA playoffs.
As I joked last month, the Iverson forecast was the first time an economist ever had a prediction come true (not a very funny joke, but a joke nonetheless). Henry Abbott of True Hoop has decided to put my power to predict to a test. Specifically, Abbott has asked a collection of “stat geeks” to forecast the playoffs in the NBA.
Before I get to the problems with this test, let me reveal my “brilliant” strategy. The following table reveals all the data I considered.
Table One: The NBA’s Best Teams in 2006-07
Yes, all I considered was each team’s efficiency differential. And the further apart the teams, the shorter I expected the series to be. Consequently, I took Chicago over Miami (more on this series in a bit). I also took Chicago over Detroit and Dallas over Golden State. The problem with this strategy is that it ignored the fact that New Jersey had a healthier Richard Jefferson to combat the Toronto Raptors. In other words, the health of the players at the time the playoffs started was not considered. Given my time constraints, though, my simple strategy was the best I could do.
So far my strategy has not done too badly. Heading into the conference finals I am three points off the lead. Unfortunately, it seems very unlikely that I will win the coveted blazer Henry Abbott is bestowing on the winner.
The two leaders are Justin Kubatko of Basketball-Reference.com and Kevin Pelton. Everyone picked the Spurs and Pistons to win in the conference finals. Pelton and I do differ, though, on how long we guessed the series would last. So if the Spurs win in five and the Pistons win in seven, I will pass Pelton. Kubatko, though, will still be three points ahead because we picked each series to last the same number of games. My only hope is to differ from Kubatko in picking the Finals. And I think, as I will explain in a moment, that this would require that I pick against my “brilliant” strategy.
Why do I have to pick against my strategy? The closeness in the standings reveals one of the difficulties with this contest. It turns out that the “stat geeks” tend to see teams the same way. In other words, we tend to agree that teams are best evaluated in terms of offensive and defensive efficiency. Consequently five out of six of the contestants picked Chicago over Miami. Four out of six took Chicago over Detroit. And as noted, all of us picked Detroit over Cleveland and the Spurs to defeat the Jazz. My sense is that Kubatko and I will reach the same conclusion about who should win in the Finals. So I will have to pick the statistical “underdog” if I am going to win this contest.
Beyond the issue of the stats people seeing teams the same way, this contest also has one other problem. Basically the playoffs suffer from the classic “small sample” problem. As the Dallas Mavericks discovered, the best team does not win every game and it’s possible for an inferior squad to win a best-of-seven series. Statistical analysis doesn’t work very well with small samples, so predicting the playoffs is not a very good test of what we do.
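To put a rough number on the small-sample point, here is a minimal sketch of how often the better team wins a best-of-seven. It assumes every game is independent and that the better team has the same per-game win probability throughout (no home-court effect, no injuries). The function name `series_win_probability` and the sample probabilities are made up for illustration.

```python
from math import comb


def series_win_probability(p_game, wins_needed=4):
    """Chance that a team wins a best-of-(2 * wins_needed - 1) series.

    Assumes independent games with a constant per-game win probability.
    """
    q = 1.0 - p_game
    # The series ends on the team's final win. Before that game it has
    # wins_needed - 1 wins and k losses in any order, for k = 0..wins_needed - 1.
    return sum(
        comb(wins_needed - 1 + k, k) * p_game**wins_needed * q**k
        for k in range(wins_needed)
    )


if __name__ == "__main__":
    for p in (0.55, 0.60, 0.70):
        print(f"per-game {p:.2f} -> series {series_win_probability(p):.3f}")
```

Under these assumptions, a team that wins 60% of its games against an opponent still loses the series about 29% of the time, so a single upset says very little about which team was better.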
All that being said, it doesn’t mean the contest isn’t a great deal of fun. It certainly has given me a rooting interest in every series, which makes the playoffs that much more enjoyable.
The Black Swan
The small sample nature of the playoffs doesn’t prevent people from trying to “explain” what we observe. For example, much has been written about various moves the Mavericks should make after the Warriors eliminated the team with the best regular season record (although not the best efficiency differential) in the first round. Or what the Bulls or Suns should do to get past the second round of the playoffs. Such explanations remind me very much of the arguments advanced by Nassim Nicholas Taleb.
One of the perks of writing a book is that publishers now send me advance copies of other people’s books to review. Consequently, I was given an advance copy of The Black Swan, Taleb’s latest book published by Random House. Taleb’s first book was entitled “Fooled by Randomness.” This first book, which was excellent, detailed the problem people have forecasting on the stock market. The Black Swan is basically a sequel to Fooled by Randomness. Although I could offer a detailed review, Stephen Dubner – of Freakonomics fame – saved me the trouble by writing a review that captured my basic sentiments.
… you will probably enjoy the writing of Nassim Nicholas Taleb, a polymathic gentleman whose new book is called The Black Swan: The Impact of the Highly Improbable. Here’s how its dust jacket succinctly describes the thesis: “A black swan is a highly improbable event with three principal characteristics: It is unpredictable; it carries a massive impact; and, after the fact, we concoct an explanation that makes it appear less random, and more predictable, than it was. The astonishing success of Google was a black swan; so was 9/11.”
I am about a third of the way through The Black Swan, and am finding it to be one of the most fun and challenging books I’ve read in a long time. It barrels its way through history, psychology, philosophy, statistics, etc. You find yourself arguing with Taleb every third sentence or so, but to me that is part of the great fun. He is a brash, stubborn, entertaining, opinionated, curious, cajoling writer and thinker. I also very much liked his previous book, Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets. The two books have similar worldviews but it is worth reading both of them. (FWIW, based on what he’s written, I think he would probably hate Freakonomics, and he explicitly hates the financial end of economics.)
I’m sure that some of you have already read The Black Swan, since it is a New York Times best-seller. If not, you can learn quite a bit about Taleb from his home page, as well as from this very good profile that Malcolm Gladwell wrote a few years back.
Like Dubner, I am a third of the way through The Black Swan, and I have the same impression. This book is a poke in the eye to those who think the analysis of stats provides the answer to everything. And this book forces us to ask if our explanations of the past are simply “concoctions” that allow us to believe the future can be predicted with great accuracy.
Defying the Black Swan
Although I agree with Taleb that we tend to “concoct” explanations of the past, this doesn’t stop me from “concocting” here and there. For example, when we look at offensive and defensive efficiency we see that Chicago was a “better” team than Detroit this year. Yet the Pistons defeated the Bulls in six games. How can this be explained? Well, let me “concoct” an explanation.
In the first round the Bulls faced the Miami Heat. Chicago probably thought they were facing the defending champs. Unfortunately, Miami had a big incentive problem. As I noted two weeks ago, players do not get paid much for the playoffs. So the only purpose in playing is to win the games. But the Heat just won last year, and since championships won (like most everything else) are characterized by diminishing returns (the first one makes you happier than the second), the Heat were not as driven to win in 2006-07. Plus many of their players are old and/or hurt, and they knew given their status that winning this year was a long-shot. Given all that, the Heat may not have been trying that hard in the first round.
But the Bulls probably thought the Heat were giving it their best shot. And when the Bulls won so easily they might have thought the playoffs were going to be easy. Then the Bulls faced the Pistons. The Pistons are not old and hurt. They also have something to prove (can they win without Big Ben?). And in the first two games the Pistons showed the young Bulls how hard a team can play in the playoffs. By the time the Bulls got the message, Chicago was in too big a hole to recover.
All that being said, what I just said is probably just nonsense. Again, there is the whole Black Swan issue. People look at the outcomes in the playoffs and then jump to conclusions. For example, people have argued that San Antonio defeating Phoenix is “proof” that how Phoenix plays in the regular season can’t work in the playoffs. (“Proof” is a word that should be removed from every sportswriter’s vocabulary; very little is “proved” in sports.) When we look at each team’s efficiency differential, though, we see that the Spurs were the better team in the regular season and thus should have been favored.
That doesn’t mean, as the Warriors demonstrated, that the Spurs were guaranteed a victory over the Suns. The efficiency story, though, does tell us that the Suns should have expected their season to end in mid-May.
Although upsets can happen, your best bet in the playoffs is to be the better team. So going forward the Suns simply have to keep trying to build the best team possible (and getting rid of Shawn Marion is not likely to be a step that is consistent with the objective of building the “best” team possible).
Writing for The New York Times
Okay, I have said in a span of over 1,000 words: “Playoffs defy prediction” and the “Playoffs are predictable.” One of those views has to be right.
Two weeks ago I wrote another column for The New York Times entitled “The Short Supply of Competitive Balance” that you may have missed. And this column seemed consistent with the “playoffs are somewhat predictable” sentiment.
As this column notes, the NBA championship has only been won by a handful of teams in the past few decades. And if either the Spurs or Pistons win again, the title will once again reside in a familiar place. In this column I utilize “The Short Supply of Tall People” argument detailed in The Wages of Wins to explain why the NBA lacks competitive balance. For those who do not have insider access to The New York Times, you can see the same argument in The Wages of Wins or in the following posts:
The Short Supply of Tall People
The Changing Fortunes of Jamal Magloire and Zach Randolph
Quick Poll
Okay, now that I am back there are a number of topics to write about. What would you like me to write about first? Here is a list of potential topics:
–Who is the MVP in the NBA (if Wins Produced is your criterion)?
–Who is the NBA’s best sixth man?
–Who was most improved in the NBA in 2006-07?
–How does experience in the NBA impact player performance?
Let me know in the comments which of these four topics you want to hear about first. My plan is to get to these four topics in the next week or so. By then we should be closing in on the Finals. During the Finals I will repeat what I did last year. Specifically I will demonstrate again that I really don’t get the Black Swan argument by offering analysis of each and every game. Hopefully this analysis will be chock-full of “concoctions.”
– DJ
JChan
May 22, 2007
I guess this is a big reason why I never take any predictions very seriously, whether they are others’ guesses or my own. But you’re right, it does make watching the games more fun. I was pulling for Chicago to be the first NBA team to ever come back from a 3-0 deficit, just because I predicted Chicago would win.
As for topics, I like the “experience in the NBA” topic. One thing I think would be interesting (although it may take a lot of time) is research into the differences between regular season performance and playoff performance. Especially in the upsets of this year. Does having more experienced players help a team play better in the playoffs?
Evan
May 22, 2007
4, then 1.
disappointmentzone
May 22, 2007
Experience. No doubt.
nick
May 22, 2007
who is mvp
Guy
May 22, 2007
Re: The Short Supply of Competitive Balance, I’m not sure why it is we should believe that basketball teams are more unequal in ability than teams in other sports. The range of scoring by NBA teams, for example, is much more narrow than for baseball teams, compared to the respective league mean scoring level. The SD/mean in the NBA is around .04, compared to .07 in baseball. The same is true on defense.
Of course, in the NBA an equal proportional advantage in scoring translates into many more wins. For example, Phoenix outscored its opponents by 7.1% (+7.3 points), for a .744 winning %. In baseball, a team that outscores its opponents by 7% (about .3 runs/game) will have a winning % of only .532. But this difference obviously results not from talent disparities in the two leagues — which in this example are identical — but is purely a function of the structure of scoring in the game (especially the fact that basketball teams have far more scoring opportunities).
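One way to see how the same scoring edge can map to very different records is a Pythagorean-style win expectation, where expected win% depends on the ratio of points scored to points allowed raised to a sport-specific exponent. This is a sketch only, and it is not the calculation used in the comment above. The exponents (about 2 for baseball, about 14 for basketball) are commonly cited values used here as assumptions, and the function name is hypothetical.

```python
def pythagorean_win_pct(scoring_ratio, exponent):
    """Expected win% for a team that scores `scoring_ratio` times its opponents.

    Equivalent to scored**e / (scored**e + allowed**e), written in ratio form.
    """
    r = scoring_ratio ** exponent
    return r / (r + 1.0)


# Assumed exponents: roughly 2 for baseball, roughly 14 for basketball.
baseball = pythagorean_win_pct(1.07, 2.0)      # 7% scoring edge in MLB
basketball = pythagorean_win_pct(1.071, 14.0)  # 7.1% scoring edge in the NBA

if __name__ == "__main__":
    print(f"baseball:   {baseball:.3f}")
    print(f"basketball: {basketball:.3f}")
```

With these assumed exponents the identical edge comes out a little above .530 in baseball and a little above .720 in basketball, in the same neighborhood as the figures quoted above. The gap comes entirely from the exponent, which stands in for game structure, and not from any difference in talent.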
Other than the variance in team win% — which can’t really tell us much — is there other evidence of greater talent disparities in basketball than other sports?
Owen
May 22, 2007
Definitely experience, with an addendum about age, since so many players come out after their freshman year, while others come out after their senior years or come from Europe.
Specifically, I would love to know the probability that Eddy Curry’s performance has peaked. It would be great to have some ammunition to fire at the argument that “hey, he’s only 24,” even though he will be entering his seventh season next year.
Charles Duggleson
May 22, 2007
I think part of the reason why we end up with perfect hindsight vision is because most of us have indeed discovered we overlooked an important factor before the fact. Happens.
We should try to see if there was a factor we missed the first time round. That factor may’ve been missed due to simple human error (perhaps pressed for time?) or due to inexperience. It does happen.
We shouldn’t just throw up our hands and say, “oh well, my theory was upset by a random black swan” and call it a day. The impulse to check the tape so to speak is a good and wise one.
But, yes, sometimes you have to admit there was no way to reasonably predict otherwise given the (assumed) reasonable dataset on hand at the time.
Anyway, my two cents on the order:
MVP
Experience on player performance
Sixth man
Most improved
Brian
May 23, 2007
A Request: Mr Berri, in light of the NBA Draft Lottery last night, do you think that you could post some data from the last 5-10 years about how WP48 can be used to predict NBA success?…I ran some numbers and it seems that for wings and point guards, win score from their last year in college is a good indicator, as well as seeing marked improvement from year to year. However, for PF’s and C’s, freshman year WP48 is a FAR better indicator than from senior or junior year.
The Franchise
May 23, 2007
Guy-
Points in basketball are different from runs in baseball, though. Baseball is much lower scoring, so it’s rare for a team to win a game by less than 10%, making such a comparison less useful.
Okapi
May 23, 2007
(1) The ex post stories you concoct for the play-offs seem like they could be objectively tested (rather than just believed)…
As a test of your “incentive” hypothesis (used to explain Miami’s loss): historically, have teams had less of a tendency to repeat as champions than the quality of the team (e.g. regular season record) would have suggested?
As a test of your ease-of-1st-round-win hurting 2nd-round-chances hypothesis: historically, have teams that surprisingly (perhaps benchmarked against Vegas odds) won easily in round 1 had a tendency to do more poorly in the 2nd round than the quality of the team would have suggested?
(2) Phil Birnbaum seems to have a good potential criticism of your competitive balance argument: http://philbirnbaum.com/wagesofwinsreview.pdf
(Perhaps your thesis stands even after controlling for the impact the structure of the game has on the dispersion of win totals. But his objections seem at least superficially valid. )
dberri
May 23, 2007
Okapi,
I am sure I have discussed this before. Still, let me quickly note again the problem with the “structure of the game” arguments for competitive balance. Competitive balance in baseball improved dramatically in the 20th century. If structure of the game drives competitive balance, then how could balance change in baseball (when the game hasn’t changed)? I would add, we see the same pattern in football and hockey. Balance seems to improve over time. This is consistent with the population hypothesis. We see less improvement in basketball, primarily because the populations drawn upon in basketball remain extremely low.
Guy
May 23, 2007
Franchise: Um, yes, that was my point. Each game is different in terms of how efficiently it translates talent differences into W-L differences. Being 7% better in the NBA means you win three-fourths of your games, while in MLB it means winning a little more than half. That’s why you can’t rely only on variation in team win% (as Dave does) to determine how big the talent spread is in a league.
Dave: Population and game structure are not competing explanations for the level of competitive balance in a league. Both clearly play a role. (And Birnbaum is certainly not arguing that player talent variation is unchanging.) Competitive balance will reflect three factors in combination: 1) spread of player talent, 2) team construction, meaning how evenly/unevenly talent is distributed among teams (e.g. salary caps, free agency), and 3) game structure, which determines how efficiently talent differences translate into wins. So differences in team win variation can’t tell you anything about player talent differences unless you can control for the other two factors.
Within a single sport, if game structure doesn’t change, then yes, reduced variation of team win% must mean a change in talent and/or team construction. Talent distribution in baseball certainly has narrowed, as you say (but team construction has also changed).
But when comparing sports, the game features are likely more dominant. If game structure isn’t important, then why is it that the best season ever by the most dominant team (Yankees) is just .714, a number eclipsed routinely by NBA teams? NBA teams post win% below .250 all the time, but it has never happened in baseball, even when talent differences were greater. Why does a 7% scoring edge make you a dominant champion in one sport, but an also-ran in the other? Only game structure can explain these differences.
Now, has the NBA seen LESS narrowing of talent differences over the past 50 years than other sports? That’s an interesting question.
It seems pretty clear that since the days of Wilt and Russell the range of talent at center has narrowed at least as much as in other sports. But maybe at other positions that’s not the case. But you would have to use PLAYER-level data to answer that. A lot of work has been done on the shrinking COV for baseball players (I don’t know about other sports), so it should be possible to compare NBA players. But until someone looks at player-level variance, we can’t say whether the NBA has been an exception to the general trend of improving talent and reduced variance.
Okapi
May 23, 2007
Dberri,
Thank you very much for your reply to my comment (and also for your always interesting posts.) I couldn’t recall whether you had looked at a time series of competitive balance information. And of course Phil Birnbaum’s “structure of the game” objection would only be relevant if you weren’t looking at the development of balance over time. (Or if you were and it didn’t exhibit the pattern you just mentioned.)
This brings up a related topic I’m considering testing in homage to Gould. Does the number of players as a percentage of the population of potential players correspond with the variability of player statistics in major categories (e.g. batting average as he so famously suggests)? I now find your competitive balance argument to be compelling so maybe as a corollary I’ll test whether Noll-Scully is correlated with the variability of player statistics.
–Okapi
James
May 23, 2007
I think your column misses a key point of The Black Swan, which I read a couple of weeks ago. Basketball wins make up a function with a normal distribution, like the distribution of tall people. Therefore, the whole theory of The Black Swan, which applies only to “scalable” or fractal distributions, doesn’t really apply to it. Yes, statistically improbable events can still happen in basketball. Golden State can beat Dallas in a seven-game series. But there is no true “long tail” in basketball, or else you’d see teams with 79 wins (and teams with 1 win).
dberri
May 23, 2007
James,
I completely agree with you on the general applicability of the Black Swan idea to sports, which is why I am not too bothered by Taleb’s general attack on economics (which, again, doesn’t seem to apply to what I do). But his point that we go back and try and explain every past event does seem to apply. No matter what happens in sports, pundits have a detailed explanation. And often those explanations are just “concoctions.”
Guy
May 24, 2007
I notice that WoW relies on the Noll-Scully ratio of observed SD to ideal SD to measure competitive balance in each sport. However, I don’t believe the ratio of the two SDs will tell you the true variance in team strength within a league, which is what you want to know. We know the relationship between observed SD, true strength, and random error (also the “ideal” SD), which is SD(observed)^2 = SD(true)^2 + SD(error)^2. So, SD(true) = sqrt (SD(obs)^2 – SD(error)^2), for which the Noll-Scully ratio of SD(obs)/SD(error) will not be a good approximation.
The ratio will tend to understate the real variance in leagues with a small number of games. For example, the “ideal” SD is .039 for MLB, vs .125 for NFL. So, to match baseball’s 2.10 Noll-Scully ratio, football would need an observed SD in win% of .263 (huge). But in that scenario, the NFL true strength variance would be more than three times greater than MLB’s (.231 v .072), despite identical Noll-Scullys.
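For anyone who wants to check the variance identity rather than take it on faith, here is a small simulation sketch. It makes several simplifying assumptions that are not in the comment: each team's true win probability is drawn from a normal distribution centered on .500, teams play independent coin-flip games (not a real schedule against each other), and the error SD is the binomial value 0.5/sqrt(games) for a .500 team. The function name, team count, and seed are arbitrary.

```python
import random
from math import sqrt
from statistics import pstdev


def simulate_observed_sd(true_sd, games, teams=4000, seed=0):
    """Simulate one season per team and return the SD of observed win%."""
    rng = random.Random(seed)
    win_pcts = []
    for _ in range(teams):
        # Draw the team's true win probability, clamped to [0, 1].
        p = min(1.0, max(0.0, rng.gauss(0.5, true_sd)))
        wins = sum(rng.random() < p for _ in range(games))
        win_pcts.append(wins / games)
    return pstdev(win_pcts)


true_sd, games = 0.072, 162            # baseball-like figures from above
error_sd = 0.5 / sqrt(games)           # assumed binomial SD for a .500 team
predicted = sqrt(true_sd**2 + error_sd**2)
simulated = simulate_observed_sd(true_sd, games)

if __name__ == "__main__":
    print(f"predicted observed SD: {predicted:.4f}")
    print(f"simulated observed SD: {simulated:.4f}")
```

The two numbers should land close together, which is the sense in which observed variance is true-strength variance plus luck. The match is not exact, partly because a team's luck variance is p(1-p)/games and only equals 0.25/games at exactly .500.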
continued…..
Guy
May 24, 2007
….continuing….
So the Noll-Scully ratio is not a good proxy for true strength differences among teams. However, we can use the ratios Dave reports to reverse engineer the observed SD, and then calculate the true strength differences in leagues:
(Noll-Scully, True strength SD)
MLS: 1.38, .071
NFL: 1.56, .150
MLB: 2.10, .072
NBA: 2.54, .129
So, soccer and baseball actually have almost identical levels of competitive balance. The NBA and NFL have much larger spreads of talent, but interestingly, it is the NFL (.150), not the NBA (.129), that has had the largest team strength variation. (Although, in the 1990s the NBA was slightly larger). And in fact, we can observe that great NFL teams will often be over .800, while great baseball teams are just .650 — the NFL is not highly competitive in this sense.
And since the NFL is the most unbalanced sport, I suppose we’ll soon be hearing about “The Small Supply of Big People”!
Guy
May 30, 2007
BTW, if anyone wants to convert a Noll-Scully ratio (NS) into the correct measure of team talent dispersion, the simple formula is SD(true) = SD(error)*SQRT(NS^2-1). And SD(error)=.5/SQRT(n) where n = # of games played in a season.
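The conversion above is short enough to sketch in a few lines. The ratios below are the ones quoted earlier in this thread. The season lengths (16, 162, and 82 games) are assumptions filled in here, and MLS is left out because its season length is not given in the thread. The function name is hypothetical.

```python
from math import sqrt


def true_strength_sd(noll_scully, games):
    """SD(true) = SD(error) * sqrt(NS^2 - 1), with SD(error) = 0.5 / sqrt(games)."""
    if noll_scully < 1.0:
        # A ratio below 1 would mean less spread than pure chance produces.
        raise ValueError("Noll-Scully ratio must be at least 1")
    error_sd = 0.5 / sqrt(games)
    return error_sd * sqrt(noll_scully**2 - 1.0)


# (Noll-Scully ratio from the thread, assumed games per season)
leagues = {"NFL": (1.56, 16), "MLB": (2.10, 162), "NBA": (2.54, 82)}

if __name__ == "__main__":
    for name, (ratio, games) in leagues.items():
        print(f"{name}: {true_strength_sd(ratio, games):.3f}")
```

Under those assumed season lengths the results come out within rounding of the NFL, MLB, and NBA figures listed earlier.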