Talking with Henry Abbott and a Comment about Model Building

Posted on March 10, 2007 by


Last Thursday evening Henry Abbott of TrueHoop sent me an e-mail asking for my thoughts on a recent conversation he had with Bill Simmons. What follows are these thoughts, which mostly center on what I think is the primary task that models are supposed to accomplish (hopefully this lengthy essay is more interesting than that last sentence).

Henry Abbott fights with Bill Simmons

This week ESPN.com listed the top centers of all time. One of its newest employees, Henry Abbott (of the immensely useful TrueHoop), was one of the people asked to vote. Abbott’s top ten excluded Moses Malone, which led to a protest from ESPN’s Bill Simmons. Simmons went so far as to threaten to remove TrueHoop from his list of favorites, which would be not only wrong but just a bit extreme.

This was revealed at TrueHoop, along with Abbott’s justification for his choice. His justification centered not so much on the merits of Malone or other players, but the immense difficulty anyone has evaluating NBA players. The following paragraph captures Abbott’s basic argument:

In Phoenix they seem to believe excellent spacing and having everyone who isn’t Steve Nash limit their dribbling is the key. In New York it looks like Jamal Crawford’s ability to get Eddy Curry easy baskets is worth enough points a game to turn a loss into a win. I could go on and on. What it takes to win is subtle and elusive, like what makes a good meal. As much art as science. Which is not to say it’s random. It’s just inscrutable. Using one players’ individual’s points and rebounds as a major tool in that debate is like using a shovel as a major tool for brain surgery: so crude it hurts.

Thursday evening I received an e-mail from Abbott, asking me what I thought of his discussion with Simmons. What follows are some semi-random thoughts that hopefully lead to a simple, yet useful, observation:

Starting in Left Field: Predicting Presidential Elections

My discussion is going to start “out in left-field” on a subject far removed from basketball and, I think, immensely more complicated. Every four years the nation chooses a President. Millions of people participate in this event, and these people consider a seemingly endless supply of factors in making this decision. The list includes the positions the candidates take on a host of issues, the style and substance of their advertising, what the candidates look like, their political and family history, etc… Each voter attaches more or less weight to each element on this list of factors in making her/his final decision. Given the millions of people involved and the host of issues any voter might or might not consider, how can anyone possibly make sense of this event?

The answer to this question can be found in a book. Before there was The Wages of Wins there was Freakonomics. And before there was Freakonomics there was a little book called Predicting Presidential Elections and Other Things. This was written by Ray Fair (a Yale economics professor), and like The Wages of Wins and Freakonomics, Fair’s book takes research published in academic journals and puts it in a form accessible to non-academics.

Although Fair’s book covers such diverse topics as predicting the quality of wine, predicting race times in marathons as a person ages, and the likelihood someone would have an extramarital affair, I want to focus on his title subject, predicting presidential elections.

For the past thirty years Fair has been offering predictions of each election based on a fairly (sorry for the pun) simple model of voting. Fair’s model ignores most of what people think is “important” about presidential politics and instead focuses primarily on the state of the economy at the time the election occurs. In a nutshell, Fair finds that if the economy is doing well at the time of the election, the incumbent party tends to win. If the economy is doing poorly, the incumbent party tends to lose. In sum, Fair has been arguing for three decades that it really is “the economy, stupid.”

It’s important to note that Fair’s simple model explains about 90% of the vote. In other words, only about 10% of the vote is explained by the factors Fair ignores. And what does he ignore? Fair ignores such seemingly important issues as who the candidates actually are, what their positions on issues might be, whether or not they are popular, etc… In fact, right now, before the candidates are even chosen, Fair can offer a forecast of the 2008 election (one can look on Fair’s website for his January 2007 forecast, which might make Republicans a bit less happy).

Now does Fair’s model tell us that candidates are not important or that campaigns don’t matter? No, but it does tell us that if the economy is on your side (growing if you represent the incumbent party, not growing if you are the challenger) your campaign is much more likely to be successful. Likewise, without the economy on your side, your campaign is likely to have problems.
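To make the “explains about 90%” idea concrete, here is a minimal sketch of how such a share is typically measured, assuming it refers to the R-squared of a regression. The numbers below are invented purely for illustration. They are not Fair’s data, and this one-variable line is far simpler than his actual equation. The function names are hypothetical as well.

```python
# Hypothetical data, invented for illustration only. NOT Fair's data or model.
growth = [-2.0, -0.5, 0.5, 1.5, 2.5, 4.0]            # election-year growth (%)
vote_share = [45.0, 48.5, 49.0, 52.5, 53.0, 57.0]    # incumbent-party vote (%)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variation in y that the fitted line accounts for."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


intercept, slope = fit_line(growth, vote_share)
r2 = r_squared(growth, vote_share, intercept, slope)
print(f"vote share = {intercept:.2f} + {slope:.2f} * growth, R^2 = {r2:.3f}")
```

The point of the sketch is only that one well-chosen variable can account for most of the variation in an outcome, which leaves everything else to fight over the remainder.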

Building Models and the Scoring Focus

The purpose of this post is not to discuss presidential politics. I note the work of Fair because I think his work illustrates the task models are supposed to accomplish. A model is supposed to be a simplification of reality. And why do we need simple? Because when we human beings try to make sense of our world (so we can do stuff like make decisions) we tend to take what is very complex and simplify. If we did not do this, decisions would be extremely difficult to make. Given how we make decisions, a good model is most helpful when it allows us to simplify “correctly.”

What does all this tell us about basketball? Henry Abbott has noticed, as people probably have since James Naismith transformed “Duck on a Rock” into the game we love, that basketball is complicated. Wins seem to be about a multitude of factors including scoring, hitting the boards, passing, ball-handling, defense, creating shots, making teammates better, etc… Like voting in presidential elections, the list seems endless. Confronted with this seemingly endless list people wonder what we should focus upon. In other words, what on this list is truly important?

As we note in The Wages of Wins, decision-makers in basketball have simplified this list by focusing their attention primarily on scoring. We find that scoring is the primary factor behind what gets you paid and the awards the NBA gives out. Factors like shooting efficiency and turnovers are not given much consideration. The problem with this focus is not that decision-makers shouldn’t simplify, but rather that the focus on scoring – or how decision-makers simplified – leads to some predictable errors.

A Better Simplification

To see this point, let’s note that it’s possible to simplify basketball and arrive at a more accurate characterization of player performance. The big job in this analysis was done by the likes of Dean Oliver and John Hollinger, who (as Dean Oliver notes) in turn built on the thinking of such people as Dean Smith and Frank McGuire. Basically, these people noted that wins in basketball are determined by how many points a team scores and surrenders per possession. In other words, offensive and defensive efficiency are what matter.
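A minimal sketch of the efficiency idea follows. Box scores do not record possessions directly, so the sketch estimates them with a common approximation (field goal attempts plus a fraction of free throw attempts, minus offensive rebounds, plus turnovers). The 0.44 weight on free throws is an assumption here, since different analysts use slightly different values. The function names and the team totals are hypothetical.

```python
def estimate_possessions(fga, fta, orb, tov, fta_weight=0.44):
    """Approximate possessions from box-score totals.

    fta_weight is an assumed value; published estimates vary a little.
    """
    return fga + fta_weight * fta - orb + tov


def points_per_possession(points, possessions):
    """Efficiency: points scored for each possession used."""
    return points / possessions


# Hypothetical single-game totals, for illustration only.
team_poss = estimate_possessions(fga=80, fta=25, orb=10, tov=14)
opp_poss = estimate_possessions(fga=82, fta=20, orb=12, tov=13)

offensive_eff = points_per_possession(100, team_poss)  # points scored
defensive_eff = points_per_possession(96, opp_poss)    # points surrendered

print(f"offense: {offensive_eff:.3f} points per possession")
print(f"defense: {defensive_eff:.3f} points allowed per possession")
print(f"differential: {offensive_eff - defensive_eff:+.3f}")
```

Because both teams get roughly the same number of possessions in a game, the team that does more with each one wins, which is why the differential is the number to watch.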

Of course, saying this doesn’t seem to help us evaluate individual players. Just as it’s not obvious how an individual player impacts wins, it also doesn’t seem clear how individuals impact a team’s efficiency measures. Fortunately, with a bit of statistical analysis, we can use the relationship between wins and efficiency to ascertain the value of the various statistics teams track for individual players.

And once we know the value of the player’s stats in terms of wins, we can then learn which factors are more important and which matter less.

The list of truly important factors is surprisingly small, and hopefully somewhat obvious. The primary factors that impact outcomes in basketball are shooting efficiency, rebounds, steals, and turnovers. In other words, a team must be able to convert its possessions into points. So shooting efficiency is important (put simply, missing shots does not help a team win). Possessions – or rebounds, turnovers, and steals – are also important. You must take the ball away from your opponent before they score. And you must avoid giving the ball away before you score. So winning the turnover battle and hitting the boards also matter.

Now it’s not the case that factors like blocked shots, assists, and personal fouls don’t matter. But none of these factors are as important as shooting efficiency, rebounds, turnovers and steals. And once we see this, we can understand the outcomes we observe.
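This weighting can be sketched with the commonly cited form of Win Score, the simple metric mentioned later in this post. The weights below are an assumption taken from that commonly cited form, and the two stat lines are hypothetical players invented for illustration. Note how blocks, assists, and fouls get half weight, while missed shots and turnovers cost a full point each.

```python
def win_score(pts, reb, stl, blk, ast, fga, fta, tov, pf):
    """Commonly cited Win Score weights (assumed here, not derived).

    Full weight: points, rebounds, steals, shot attempts, turnovers.
    Half weight: blocks, assists, free throw attempts, personal fouls.
    """
    return (pts + reb + stl + 0.5 * blk + 0.5 * ast
            - fga - 0.5 * fta - tov - 0.5 * pf)


# Hypothetical per-game lines, for illustration only.
high_volume_scorer = win_score(
    pts=30, reb=3, stl=1, blk=0, ast=5, fga=27, fta=8, tov=5, pf=2
)
efficient_rebounder = win_score(
    pts=10, reb=14, stl=1, blk=2, ast=1, fga=7, fta=4, tov=1, pf=3
)

print(f"high-volume scorer:  {high_volume_scorer:.1f}")
print(f"efficient rebounder: {efficient_rebounder:.1f}")
```

A player who scores a lot while using many shots and possessions can grade out below a low-scoring player who rebounds and rarely wastes a possession, which is the pattern behind the examples that follow.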

For example, the Rockets lost Yao Ming to a devastating injury, yet managed to maintain their winning percentage. Once we see the importance of rebounding, though, we can see how having an extraordinary rebounder like Dikembe Mutombo come off the bench mitigated the loss of Yao.

We also see why the 76ers improved after Iverson departed. Iverson has problems hitting shots and avoiding turnovers, so despite his scoring totals, our model tells us he does not produce as many wins as his star power suggests. In other words, we should not have expected a team that replaced Iverson with Andre Miller to get worse (as many people who focused on scoring predicted).

The Simple Lesson Learned

The temptation in doing analysis – whether it is elections or basketball – is to consider everything that anyone thinks could matter. Models, though, are not supposed to consider everything. Models are supposed to be simplifications of reality that allow us to focus on the factor or factors that truly are important.

Let me put it another way: When an analyst ignores what a model is supposed to do, and tries to tell decision-makers the value of everything, the analyst ultimately tells the decision-maker nothing.

All this being said, one should not think the work of Fair is the final word in modeling presidential elections or the best possible model anyone will ever offer. Nor should people think Wins Produced, Win Score, PAWS, etc… are “perfect” models of player performance in the NBA.

I would argue, though, that these are all “good” models in the sense that each is both a simple and accurate reflection of reality, reflections that allow people to better understand the world they observe. And ultimately, this is what we want models and analysis to do. Simplify a complex world correctly, so people can make better decisions about the allocation of resources (which is all economics is really about).

Final Thought

Of course all this ignores the really important questions. Is Moses Malone one of the top ten centers of all time? And why did Bill Simmons – a life-time Boston fan – get that upset when Abbott left Malone off his list? After all, Malone never played for the Celtics. Perhaps this is something to discuss in the comments.

– DJ