In The Wages of Wins we use just a little bit of statistics. Much of the book is still friendly, easy-to-follow words. But every once in a while we let it slip that as professors of economics – who all have taught econometrics – we are basing our conclusions on our statistical analysis of the data.

One of the conclusions that we reach is that money cannot buy love in baseball (or football, or basketball, or hockey). This conclusion is based on the ability of relative payroll to explain the variation in winning percentage. In simple words, payroll doesn’t seem to tell us much about wins.

Recently, some individuals who claim to have knowledge about statistics have questioned this conclusion. Specifically – and this is where this post gets a little technical – people have questioned the use of the coefficient of determination – otherwise referred to as r^{2}. These individuals have suggested that using the correlation coefficient – otherwise known as r – is a more “real-life” statistic to use in looking at how payroll and wins are related in Major League Baseball. As you can guess, we disagree. Here’s why.

Let’s begin with the evidence. From 1988 to 2006 the correlation coefficient between relative team payroll and winning percentage is 0.43. In The Wages of Wins we chose to report the coefficient of determination – or the correlation coefficient squared – and that is 0.18. Which statistic gives us the best picture of this relationship?

There is actually a problem with drawing conclusions from the correlation coefficient. In the words of R.J. Rummel, who provides an excellent tutorial entitled Understanding Correlation:

“As a matter of routine it is the squared correlations that should be interpreted. *This is because the correlation coefficient is misleading in suggesting the existence of more covariation than exists,* and this problem gets worse as the correlation approaches zero.” (emphasis added).

In essence, the correlation coefficient exaggerates the relationship between any two variables. That is why we employ the coefficient of determination.

It is important to understand how this statistic is interpreted. An r^{2} = 0.18 means that across our sample, 18% of the variance in the two variables is in common, and 82% is not in common. So given an r^{2} of 0.18 we conclude that relative payroll explains 18% of the variation in winning percentage.
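The arithmetic behind these two figures is short enough to check by hand. Here is a minimal sketch in Python; the only input is the 0.43 correlation reported above, and the variable names are ours.

```python
# r is the correlation reported in the post; everything else is derived from it.
r = 0.43

r_squared = r ** 2            # coefficient of determination
unexplained = 1 - r_squared   # share of variance not in common

print(f"r^2 = {r_squared:.2f}")                  # r^2 = 0.18
print(f"unexplained share = {unexplained:.2f}")  # unexplained share = 0.82
```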

From this we conclude that the relationship between relative payroll and wins is quite weak. Of course we admit that there is no common level of explanatory power that is accepted. In other words, one could come back and say that 18% is quite large. And relative to the NFL where the explanatory power is less than 5%, maybe it is. Still, we do not think 18% is anything to hang your hat on. Simply put, payroll does not explain much of wins in Major League Baseball and therefore, the evidence tells us that teams cannot simply buy wins in baseball.

– Stacey

*Baseball Stories*

Beamer

November 7, 2006

The problem is that you don’t really answer the question as to why rsq is better than r. This blog (http://sabermetricresearch.blogspot.com/2006/11/wages-of-wins-on-r-and-r-squared.html#links) makes a compelling case why you are wrong … I don’t think it is enough to say so-and-so prof thinks we are right therefore we are right. You should be able to argue the point with sheer logic.

Beamer

November 7, 2006

Blog link from post above

Alex

November 8, 2006

Yeah. The point on the blog you link to is right. 1 std dev change in salary (25m) equates to 0.43 std deviation in wins – 5 wins. In my view that is fairly significant.
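The reading of r as "standard deviations of wins per standard deviation of salary" can be checked with a small sketch. The payroll and win figures below are made up purely for illustration (they are not MLB data), and the helper names are hypothetical. The point is only that the regression slope on standardized variables equals r, while the slope in raw units is r scaled by the ratio of the two standard deviations.

```python
from math import sqrt

# Made-up illustrative numbers, NOT real MLB data.
payroll = [40.0, 55.0, 60.0, 75.0, 90.0, 110.0]  # hypothetical, $ millions
wins = [70.0, 88.0, 74.0, 81.0, 79.0, 95.0]      # hypothetical season wins

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    """Sample standard deviation."""
    m = mean(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def standardize(xs):
    """Rescale to mean 0 and standard deviation 1."""
    m, s = mean(xs), sd(xs)
    return [(x - m) / s for x in xs]

def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r = pearson_r(payroll, wins)
slope_std = ols_slope(standardize(payroll), standardize(wins))
slope_raw = ols_slope(payroll, wins)  # wins per $1M, in raw units

print(f"r = {r:.3f}, standardized slope = {slope_std:.3f}")
print(f"raw slope = {slope_raw:.3f} wins per $1M")
```

So converting "0.43 standard deviations" into wins depends entirely on which standard deviation of wins you plug in.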

dberri

November 8, 2006

Beamer and Alex:

Let me come to Stacey’s defense. Much of what you linked to is — to put it mildly — flawed.

For example, let’s just focus on the following statement from the article:

“The 18% makes sense only in context of the squares of what you’re actually trying to measure. If salary explains 42% of performance, then salary explains 18% of performance squared. But we sabermetricians don’t care about what performance squared; we care about performance. And that’s why the .42 is more meaningful than the .18.”

It is incorrect to say that salary explains 18% of performance squared. That is not how r-squared is interpreted. R-squared tells us the percentage of variation in the dependent variable that is explained by the model. At the very least, you need to know what these terms are if you are going to enter into this kind of a discussion.

Let me explain our argument this way. What we wish to know is the predictive power of relative payroll. Our analysis indicates that relative payroll explains 18% of the variation in wins. In simple words, it has fairly weak predictive power. Unless you can show that by knowing relative payroll I can predict wins with accuracy, this discussion is going nowhere. I am sorry if you cannot see the “sheer logic” of this argument.
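One way to put a number on "predictive power" is to ask how much smaller the typical prediction error becomes once payroll is known. This is a rough sketch, assuming a simple one-variable linear regression and ignoring degrees-of-freedom corrections. In that setting the residual variance is (1 - r^2) times the variance of the dependent variable, so the residual standard deviation is sqrt(1 - r^2) times the original.

```python
from math import sqrt

r = 0.43  # correlation reported in the post

# Fraction of the original spread in wins that remains after using payroll.
residual_sd_fraction = sqrt(1 - r ** 2)
error_reduction = 1 - residual_sd_fraction

print(f"residual SD as a share of the original SD: {residual_sd_fraction:.2f}")  # 0.90
print(f"reduction in typical prediction error: {error_reduction:.0%}")           # 10%
```

In other words, under these assumptions, knowing relative payroll trims the typical miss in a wins forecast by roughly a tenth.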

Beamer

November 8, 2006

David

My first comments may have been a little facetious. Apologies if they came across as such. Let me try to explain in a bit more detail — and hopefully you’ll be able to point out when I am wrong.

I agree with the point that you make about what r squared means. As you say, it explains the variance in the model, and to talk about performance squared isn’t technically accurate. I believe that Phil used that language to try to explain to the layman why, in his view, R squared was not the appropriate measure.

I do believe, though (and please correct me if I am wrong), that Phil’s main point still stands: if you look at R, then a 1 std dev increase in payroll ($25M) is equal to a 0.43 std dev increase in wins (4.3). This is indisputable. So the question is, is this significant? One could argue that yes, it is: an 85 win team can become a 90 win team simply by spending another $25M, on a couple of big name free agents, perhaps.

On the other hand (doesn’t every good economist have 2 hands?) $25M is a lot of money, and on a 162 game schedule we’d expect a std dev of 6.3 games. This means that the variance we’d expect from luck is greater than the variance we’d expect from spending an extra $25M.

This shows through in the correlation calculation.
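The luck figure can be reproduced with a quick binomial sketch. It assumes every game is an independent coin flip for an average team, which is a simplification we are making here.

```python
from math import sqrt

games = 162
p = 0.5  # assumption: each game is an independent 50/50 coin flip

# Standard deviation of wins for a binomial(games, p) season.
luck_sd_wins = sqrt(games * p * (1 - p))

print(f"SD of wins from luck alone: {luck_sd_wins:.1f}")  # 6.4
```

That lands right around the 6.3 games quoted above.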

I guess it depends on how you define significant. You say it is not; Phil says it is. Personally, I don’t think the argument is as clear cut as either of you believe. Money is probably somewhat important in determining wins, but nowhere near the dominant factor.

Apart from the conclusion, I believe the 3rd from last paragraph is basically your view.

JavaGeek

November 8, 2006

Beamer: I think your analysis on luck is incorrect. Luck exists whether you spend or don’t spend and will be on either side of the regression; this is just the natural distribution of errors (you can never get rid of this error). You still get the same number of wins by spending the money (an unlucky bad team does a lot worse than an unlucky good team). Luck just makes the r-squared smaller. If there was 95% luck the best regression would have an r-squared of 5%.

“I guess it depends on how you define significant. You say it is not; Phil says it is. ”

This is why statisticians use the coefficients and not the r or r-squared values to determine if a variable is significant. See, each coefficient has a certain std-dev, and if its confidence interval includes 0, you can’t conclude that it’s any different from 0 for a certain amount of confidence (95% for example). I’m not sure how the r-squared fits in here, but the r-squared doesn’t matter as much as these confidence intervals do.

But remember hypothesis tests “do not reject” rather than “accept” the possibility. Therefore failing the hypothesis test doesn’t mean that there isn’t a relationship; it only means there’s insufficient evidence to conclude there IS a relationship.

Of course we haven’t even started discussing normality of error and other properties that are likely violated in a regression on salary and wins.
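The confidence-interval check described above can be sketched as follows. The sample size n is HYPOTHETICAL, since the post never states how many team-seasons are in the data, and the interval uses the normal approximation rather than an exact t value. With both variables standardized, the slope equals r and its standard error is sqrt((1 - r^2) / (n - 2)).

```python
from math import sqrt

r = 0.43  # correlation reported in the post
n = 500   # HYPOTHETICAL number of team-seasons; not given in the post

# Standardized regression: slope equals r, with this standard error.
slope = r
std_err = sqrt((1 - r ** 2) / (n - 2))

# Approximate 95% interval using the normal critical value 1.96.
low = slope - 1.96 * std_err
high = slope + 1.96 * std_err
excludes_zero = low > 0 or high < 0

print(f"slope = {slope:.2f}, approx. 95% CI = ({low:.2f}, {high:.2f})")
print("interval excludes zero:", excludes_zero)
```

Note that a tight interval around 0.43 says the coefficient is reliably non-zero; it says nothing about whether 0.43 is large.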

dberri

November 8, 2006

JavaGeek,

No one is proposing that we use r-squared to test statistical significance. And no one is disputing that payroll is statistically significant with respect to wins. What we are arguing is that payroll does not allow one to predict wins with great accuracy.

What is forgotten in this entire discussion is the question we were addressing in The Wages of Wins. Bob Costas stated “The fact is, the single biggest indicator of a team’s opportunity for success from one year to the next is whether the team has a payroll among the top few teams in the league. Period.” It is the sentiment that Costas expresses that we examined. And from our analysis, it is difficult to believe that a factor that only explains 18% of wins is the determinant of success in baseball.

Guy

November 8, 2006

So let’s look at the Costas statement. The top 6 teams in payroll account for 11 postseason appearances over the past 3 seasons, getting into the playoffs about 2 years out of three on average. The bottom 6 payroll teams account for zero playoff appearances in that period (and the bottom 11 payroll teams have just one playoff spot over 3 seasons). If we define “success from one year to the next” as making the postseason with some frequency, I’d say Costas’ statement stands up extremely well.

All wins are not created equal. We don’t really care if our team wins 69 games or 79 games. What fans care about is whether their team can make a run at the playoffs on a somewhat regular basis, and get there a non-trivial portion of the time. Clearly, a high payroll greatly increases a team’s chances of doing that. Neither r nor r^{2} really gets at the issue very well (although payroll and playoff appearances have a not too shabby r of about .64).

I suppose you could argue that avoiding a humiliating sub-70 win season is also a kind of success. But in fact there is also a very strong correlation between having an extremely low payroll and that kind of sustained futility.

Can mid-range payroll teams succeed? Sure. But there appears to be a threshold of around $60M below which success is nearly impossible. And a payroll of about $90M or more gives a team a far-above average chance of success. If you define “success” correctly, then the notion that payroll is not an important determinant of success in baseball is simply not plausible.

Beamer

November 8, 2006

I think that it ultimately comes down to the question that you are trying to answer. If the question is “If I build a statistical model between wins and payroll, how much of the relationship can the model explain?”, then the answer is as the author says. If the question is “How important is this?”, then, as Phil argues, we need to understand the effect that payroll has on wins.

The effect in this case is that 1 std dev change in payroll (25M) = 0.43 std dev in wins (4.5 wins). Now we can debate whether this is important or not. I, and many others in the baseball sabermetric community, would argue that at the very least it isn’t unimportant. Whether it is the single most important factor is, again, an open question. Because the baseball pay market is slightly fictitious, since there are a whole host of young players who don’t qualify for market salaries, it is hard to make a direct link. If this restriction didn’t exist the relationship would be a lot stronger.

Now Guy’s question is: Does adding Payroll buy success? This is, again, different from the question the authors pose but is probably more important. The analysis here would seem to suggest yes! Guy’s question is, in my mind, the right question, which is not to say that the question the authors pose is invalid.

***

JavaGeek,

In statistical parlance what I was getting to was this:

R = true variance / observed variance

random variance = 6.3 wins

observed variance = 9 wins (or thereabouts)

true variance = sqrt (9^2 – 6.3^2) = 6.4 wins

R = 36 / 90 = ~0.42
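The subtraction step above can be sketched in a few lines. It treats the figures quoted in wins as standard deviations and assumes luck and underlying team quality are independent, so that their variances (the squared values) add up to the observed variance. Both of those readings are assumptions on our part.

```python
from math import sqrt

observed_sd = 9.0  # observed spread in wins, as quoted above ("or thereabouts")
luck_sd = 6.3      # spread expected from luck alone, as quoted above

# Assumption: observed variance = true variance + luck variance.
true_sd = sqrt(observed_sd ** 2 - luck_sd ** 2)

print(f"true SD of wins: {true_sd:.1f}")  # 6.4
```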

JavaGeek

November 8, 2006

You conclude r-squared is too small to conclude it is THE determining factor. Presumably THE determining factor is the factor that explains the most variability compared to alternative variables. It would seem trivial to disprove a variable is THE determining factor simply by finding a factor that is better at determining wins (correct?). However, instead you conclude that the r-squared is such that it explains less than 1/5th the variability, so there “should” exist some variable that does better. Now, if we knew it was a 5 or 6 variable model, this would be a reasonable conclusion.

Is it possible to have a small r-squared and yet be the “most important variable”?

Let’s say we have a perfect model with 200 variables (no cross correlations) each variable would explain roughly 0.5%. A variable like spending that explains 18% would be a lot better than all those other variables and would easily be considered the “determinant of success” would it not? Even with its low r-squared spending would be THE “determinant of success” in this model.

In other words the only reasonable way to conclude it isn’t THE “determinant of success” is to find that variable that does better. Good luck.

Guy

November 9, 2006

“Now Guy’s question is: Does adding Payroll buy success? This is, again, different from the question the authors pose but is probably more important. ”

John: Assuming that David has quoted himself correctly above, I don’t see how it could possibly be different from the issue the authors are trying to address!

BTW, this same problem of choosing an inappropriate metric applies to the authors’ focus on the SD of winning percentage as a way of measuring league parity. The concern that is raised about MLB–correctly or not– is that a few large market teams dominate year after year, while some small market teams have almost no chance to make the playoffs. If fans were saying “there are just too many .400 and .600 teams in baseball, it would be better if most teams were closer to .500,” then looking at SD of win% would make a lot of sense. But of course, no fan says that. When you choose the wrong metric, the resulting analysis is not necessarily wrong in a technical sense, but totally beside the point.

Beamer

November 9, 2006

Guy, point taken. The authors’ answer is only right if the question is: if I build a model that correlates wages and wins, how much of the variance between these two variables can be explained by the model? This is an irrelevant question.

One common error that many economists seem to make when analysing baseball is that they think regression is the answer to everything. It rears its head if you try to work out LWTS through regression. Doing that can give incorrect values for low-frequency plays, e.g. a triple. Regression is a useful tool but is seldom the last word on a subject, and when it is used it *must* be interpreted correctly.

JCB

November 9, 2006

No, it’s not. It’s interesting that more of the variance is produced by OTHER FACTORS, including random variation.

A corresponding error in the sabermetric community is the strange bias against regression analysis where it is the appropriate tool. For estimating run values it is not. LWTS is better and good academic sabermetricians are aware of this. Just see Albert and Bennett’s book on baseball statistics, Curve Ball. In this case, regression is the appropriate tool. Let’s not throw the baby out with the bath water.

No one is denying the positive correlation between salary and winning, just that other factors seem to play a larger role. This finding is consistent with the regressions presented in Zimbalist’s “May the Best Team Win” and fits with what I have found in my own studies.

Anonymous

November 9, 2006

Is there a reason why comments on this blog that make valid points are selectively deleted?

Gary Graul

November 21, 2006

High payroll teams are almost always winners; low payroll teams are almost always losers. Case closed.
