Feng: Is the Home Run Derby slump real?
Tonight, eight of Major League Baseball's top sluggers face off in the Home Run Derby. The conventional wisdom says these players will slump after their participation in this event.
There are a few possible reasons for this decline in power numbers. Some have suggested the Derby requires a different approach from a regular-season at-bat. Hitters use an all-out uppercut swing to maximize the chance of a home run, which might mess up their swing in the second half of the season.
Or maybe these professional baseball players just get tired from taking an extra 50 swings under pressure. The season is 162 games long, and an extra batting session just adds to the fatigue.
When Tony Paul asked me to look into this Home Run Derby curse, I was 95 percent sure it didn't exist. These reasons seemed inane.
However, I was wrong. The slump is real, but there's a satisfying explanation behind the curse. Let's take a look at the statistics.
The numbers behind the Home Run Derby slump
Players do hit fewer home runs after participating in the Home Run Derby.
Joseph McCollum and Marcus Jaiclin, two math professors, performed a study on all Derby participants from the start of the competition through 2009. They looked at home run rate, or home runs divided by plate appearances, both before and after the All-Star break.
While Derby participants had a home run rate of 6.1 percent during the first half of the season, this rate dropped to 5.3 percent for the second half. They found this result to be statistically significant, which means a drop this large is very unlikely to be the product of chance alone.
In addition, these same power hitters had the same home run rate, about 4.2 percent, in the first and second halves of seasons in which they didn't participate in the Home Run Derby. Note that this also implies that these hitters got selected for the Derby due to higher-than-usual home run rates during the first half of the season.
The power numbers for Derby participants do drop off in the second half of the season. However, there is a simple explanation for this drop that has nothing to do with the exhibition during the All-Star break.
A textbook case of regression to the mean
To understand why power numbers decline for Derby participants, consider Albert Pujols, one of this year's participants.
Pujols has hit 26 home runs this year. With 359 plate appearances, he has a 7.2 percent home run rate, a significant improvement over his career rate of 5.6 percent before this season.
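The arithmetic behind those Pujols numbers can be sketched in a few lines of Python. The function name is just an illustrative choice, and the last figure, how many home runs his career rate would predict over the same plate appearances, is a back-of-the-envelope comparison rather than anything from the study.

```python
def home_run_rate(home_runs, plate_appearances):
    """Home runs divided by plate appearances."""
    return home_runs / plate_appearances

# Pujols' first half this year, as quoted above
first_half_rate = home_run_rate(26, 359)

# Home runs his pre-season career rate (5.6 percent) would predict
# over the same 359 plate appearances
expected_at_career_rate = 0.056 * 359

print(f"first-half rate: {first_half_rate:.1%}")                    # 7.2%
print(f"expected at career rate: {expected_at_career_rate:.1f}")    # 20.1
```

In other words, Pujols has hit about six more home runs than his career rate would suggest.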
It's unlikely that the 35-year-old Pujols has become a better home run hitter since last season. His first-half power numbers were most likely a statistical fluke, and you should try to avoid seeing patterns in this randomness.
Pujols' home run rate will regress to the mean during the second half of the season. This means that he got lucky with his first-half power numbers. For the remainder of the season, Pujols will most likely hit home runs near his career rate of 5.6 percent or perhaps less if you consider his age.
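One rough way to ask how lucky a first half like this is: assume, purely for illustration, that every plate appearance is an independent trial and that Pujols' true home run rate is exactly his career 5.6 percent. Under that simplification (my assumption, not part of the McCollum and Jaiclin study), the binomial distribution gives the chance of 26 or more home runs in 359 plate appearances.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) when X counts successes in n independent trials
    that each succeed with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumed true rate: Pujols' career 5.6 percent before this season
chance = prob_at_least(26, 359, 0.056)
print(f"chance of 26+ home runs from luck alone: {chance:.1%}")
```

The model ignores ballparks, pitchers and aging, so treat the result as a sanity check on the word "fluke" rather than a precise estimate.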
Random deviations in home runs
We can go back to the study of McCollum and Jaiclin for more evidence that the Home Run Derby slump is simply regression to the mean.
Remember, regression to the mean occurs when a player has a statistical fluke in power numbers for half a season. This fluke could occur during the second half of these seasons just as often as the first.
McCollum and Jaiclin took the same set of Home Run Derby participants and looked at years in which they had their highest home run rate in the second half. These were not years in which they participated in the Derby.
In these second-half seasons, these power hitters had a home run rate of 6.0 percent, very close to the 6.1 percent these players hit in the first half of their Derby years. These players had a home run rate of 5.4 percent in the first half of these seasons, again very similar to the 5.3 percent rate in the second half of Derby seasons.
This means that these anomalous half seasons could also occur during the second half of the season, with a similar drop-off in the other half. This suggests that these deviations in home run rate are random.
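A small simulation shows how this selection effect can produce a "slump" with no curse at all. It is a toy model under assumptions of my own: 200 hypothetical hitters who all share the same true 4.2 percent home run rate, 300 plate appearances per half, and a Derby that invites the eight best first halves. None of those numbers come from the study except the 4.2 percent baseline.

```python
import random

def simulate_half(true_rate, plate_appearances, rng):
    """Count home runs when each plate appearance is an independent trial."""
    return sum(rng.random() < true_rate for _ in range(plate_appearances))

def derby_simulation(num_players=200, num_selected=8, true_rate=0.042,
                     pa_per_half=300, seed=0):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    players = []
    for _ in range(num_players):
        first = simulate_half(true_rate, pa_per_half, rng) / pa_per_half
        second = simulate_half(true_rate, pa_per_half, rng) / pa_per_half
        players.append((first, second))
    # "Invite" the hitters with the highest first-half home run rates
    selected = sorted(players, key=lambda p: p[0], reverse=True)[:num_selected]
    first_avg = sum(p[0] for p in selected) / num_selected
    second_avg = sum(p[1] for p in selected) / num_selected
    return first_avg, second_avg

first_avg, second_avg = derby_simulation()
print(f"invited hitters, first half:  {first_avg:.1%}")
print(f"invited hitters, second half: {second_avg:.1%}")
```

Because every simulated hitter has identical skill, any gap between the two printed rates comes entirely from picking players after a lucky half.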
The consequences for other players
This regression to the mean explanation has consequences for other players not participating in the Home Run Derby.
For example, let's pick on Kansas City catcher Salvador Perez. He's hit 15 home runs this year for a 4.9 percent home run rate. Before this season, he had a career home run rate of 2.7 percent.
Regression to the mean implies Perez will probably not sustain his power numbers, even though he will not participate in the Home Run Derby. Unfortunately, the same conclusions also apply to J.D. Martinez.