Rocket Science: Clemens and ’Roids

Posted on June 4, 2007 by


Another brilliant comment from economist Steve Walters of Loyola College:

I plead guilty to being uncharitably suspicious. Watching Roger Clemens go off on a young sportswriter after his last minor-league tune-up in Scranton, I wondered: “’roid rage?”

The tantrum was, at one level, merely classic Egotistical A-hole Athlete. Clemens escalated an angry denunciation of the writer into a larger rant against negativity by the press in general, playing the usual jock card about how sportswriters have never been “in the arena” and so are unqualified to ponder weighty, complex topics like… pitching. Translation: I’m way better than you; shuddup.

But the tone and direction of the tirade were so utterly inappropriate, so disconnected from the question asked, that the writer would later describe it as “downright scary.” It was, and without any reasonable basis that I could see. And so my ungenerous imagination took over.

Which is why MLB must do far more to assure us that the guys who are performing such miraculous feats these days—and at such advanced ages—are doing so legitimately. The current testing program is a joke, and until it’s toughened up we’ll continue to have our suspicions, fairly or not. And since the program is in its infancy, there’s even greater suspicion about the legitimacy of records set a few years ago.

Which brings us to an interesting study by Yale economist Ray C. Fair, titled “Estimated Age Effects in Baseball.”

Fair took a cut at the question Bill James first asked back in the ’80s and Jim Albert pursued in ’02 (discussed here on April 21 in “It Ain’t Necessarily So”): When do ballplayers reach their peak? He used some sophisticated statistical tools to get the basic answers (28 for hitters, 27 for pitchers) for his sample of 441 hitters and 144 pitchers who played at least 10 full seasons between 1921 and 2004.

Then he asked this follow-up: Which players have exhibited the most unusual age-performance profiles? Specifically, are there any players who got better with age?

Over the entire period from 1921 to 2004, Fair found only 18 hitters who appear to have defied Mother Nature, logging four or more seasons after the age of 28 in which their OPS (on-base percentage plus slugging average) exceeded their age-specific expected level by more than one standard error. Here’s the list, ranked by the size of the largest “over-performance residual” (with the year and the player’s age in his greatest “outlier season” in parentheses):

1. Barry Bonds (2004, 40)

2. Sammy Sosa (2001, 33)

3. Luis Gonzalez (2001, 34)

4. Mark McGwire (1998, 35)

5. Ken Caminiti (1996, 33)

6. Albert Belle (1994, 28)

7. Larry Walker (1999, 33)

8. Dwight Evans (1987, 36)

9. Gary Gaetti (1998, 40)

10. Rafael Palmeiro (1999, 35)

11. Andres Galarraga (1998, 37)

12. Chili Davis (1994, 34)

13. Julio Franco (2004, 46)

14. Paul Molitor (1987, 31)

15. Bob Boone (1988, 41)

16. Steve Finley (2000, 35)

17. B.J. Surhoff (1995, 31)

18. Charlie Gehringer (1939, 36).
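For readers who want to see the screening rule spelled out, here is a minimal sketch of the criterion described above the list: count the seasons after age 28 in which actual OPS beats the age-specific prediction by more than one standard error, and flag the player at four or more. This is not Fair’s code or his model. The `Season` class, the function names, the standard error value, and every number in the sample career are made up for illustration, and the expected OPS figures are assumed to come from some already-fitted age model.

```python
from dataclasses import dataclass


@dataclass
class Season:
    year: int
    age: int
    actual_ops: float
    expected_ops: float  # assumed output of a fitted age model (hypothetical)


def outlier_seasons(seasons, std_error, peak_age=28):
    """Seasons after the peak age where OPS beats the prediction by > 1 standard error."""
    return [
        s for s in seasons
        if s.age > peak_age and (s.actual_ops - s.expected_ops) > std_error
    ]


def is_flagged(seasons, std_error, min_seasons=4):
    """True when the player has at least `min_seasons` outlier seasons."""
    return len(outlier_seasons(seasons, std_error)) >= min_seasons


def largest_residual(seasons, std_error):
    """The outlier season with the biggest over-performance residual, or None."""
    flagged = outlier_seasons(seasons, std_error)
    if not flagged:
        return None
    return max(flagged, key=lambda s: s.actual_ops - s.expected_ops)


# Entirely invented career, for illustration only.
STD_ERROR = 0.075
career = [
    Season(2000, 28, 0.900, 0.850),  # ignored: not yet past age 28
    Season(2001, 29, 0.950, 0.845),
    Season(2002, 30, 0.860, 0.840),  # above prediction, but within one standard error
    Season(2003, 31, 0.940, 0.830),
    Season(2004, 32, 0.930, 0.820),
    Season(2005, 33, 1.000, 0.810),
]

if __name__ == "__main__":
    print([s.year for s in outlier_seasons(career, STD_ERROR)])
    print(is_flagged(career, STD_ERROR))
    print(largest_residual(career, STD_ERROR).year)
```

Ranking flagged players by the residual of their single best outlier season, as `largest_residual` does for one player, is how the list above is ordered.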

Notice anything? According to Fair’s analysis, Mother Nature apparently lost her grip starting in the ’90s. Only three of the guys on this list (Boone, Evans, and Gehringer) played mostly before the ’90s, and Fair observes that their performance residuals really don’t show the same pattern as the rest of the players on the list; in other words, they’re more likely to be just late bloomers.

But the rest might be, if not suspects, “persons of interest” in an investigation of the effect of performance-enhancing drugs on baseball since the early ’90s. One of the players on the list has tested positive for ’roids; another (according to illegally leaked grand jury testimony) admits to having briefly used them, though unwittingly. Fair concludes that “since there is no direct information about drug use in the data…, [these findings] can only be interpreted as showing patterns for some players that are consistent with such use, not confirming such use.”

What about Clemens? Since the pitchers’ sample was much smaller, Fair didn’t examine it for unusual aging patterns. Given the greater variance in pitching performance, he probably thought no pitchers would satisfy his criteria for “suspect residuals.”

But I punched the appropriate parameters for Clemens into a spreadsheet nevertheless, and found that the Rocket is just a single year of dramatic over-performance shy of crossing the “Fair threshold.” By my calculations, Clemens’s performance during his presumed “decline phase” exceeded the Fair model’s age-specific prediction by more than one standard error in 1997 (at age 34), and in both the ’05 and ’06 seasons (when he was 42 and 43). If he puts up an ERA below 3.15 in what’s left of ’07, he’ll be in the company of statistical outliers like Bonds, Sosa, Caminiti, and Palmeiro.
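The arithmetic behind that “one season shy” claim can be sketched in a few lines. The three qualifying seasons, the four-season threshold, and the 3.15 ERA cutoff come straight from the paragraph above. The function and constant names are mine, and the sketch simply takes the 3.15 figure as given rather than deriving it from the model. Note that for pitchers the direction flips: a lower ERA is better, so over-performance means coming in under the cutoff.

```python
# Seasons the post reports as beating the model's prediction by > 1 standard error.
CLEMENS_OUTLIER_SEASONS = [1997, 2005, 2006]
FAIR_THRESHOLD = 4       # outlier seasons needed to make the list
ERA_CUTOFF_2007 = 3.15   # figure quoted in the post, taken as given


def crosses_threshold(era_2007):
    """Would a given 2007 ERA push Clemens to four outlier seasons?"""
    seasons = list(CLEMENS_OUTLIER_SEASONS)
    if era_2007 < ERA_CUTOFF_2007:  # lower ERA is better for a pitcher
        seasons.append(2007)
    return len(seasons) >= FAIR_THRESHOLD


if __name__ == "__main__":
    for era in (2.90, 3.15, 4.20):
        print(era, crosses_threshold(era))
```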

Of course, this is just statistical doodling. One of the attractions of sport is the possibility that we’ll see something that the laws of probability say is extremely unlikely. Outliers make the games fun. And extreme outliers are rarities, not impossibilities. Their occurrence need not be considered evidence of cheating; they might simply be a manifestation of excellence.

So let’s all fervently hope that the historic accomplishments we’re witnessing in big-league ballparks these days are perfectly legit—that these guys are just freaks of nature who, coincidentally, started getting freakish at just about the same time the market for steroids got out of control.

If so, however, we’d still have to face at least one fact: Roger Clemens frequently acts like a jerk for no good reason.

–Steve Walters
