Thursday, March 12, 2009

How Does the WBC Affect Players' Regular Season Performance?

Several teams have discouraged their big-name players from participating in the World Baseball Classic, and many others, most notably the New York Yankees, have made public their displeasure with the WBC taking their stars away for a portion of spring training. They and many other WBC detractors worry that the regular season performance of the participants will suffer because they have been taken out of their usual spring training routine. This study examines the statistics of the 2006 World Baseball Classic participants and attempts to determine the truth of this claim.

Now, if the performance of WBC players were to suffer, it would likely occur in two key months: April, when the disruption of their spring training routine may throw off their early-season performance, and September, when the WBC players may tire after having essentially started their season early by playing meaningful March games. The mid-season months of May through August should remain unaffected, since by then the players have had time to get into the regular flow of the season but have not yet had a chance to tire from the extra spring games.

Looking back at 2006, I identified 63 hitters, 22 starting pitchers, and 29 relief pitchers who played in the Classic and also played significant time in the major leagues that season. For batters I then collected their OPS and number of plate appearances during April, September, and the rest of the season. For pitchers I collected their ERA and innings pitched. Using the May-August months as a control, I compared their statistics to both their April and September stats to determine if their stats really did suffer as a result of the WBC.

Doing this comparison was more difficult than it appears. Simply comparing the total OPS in April with the total OPS in May-August would be inaccurate because if the good players tended to bat more in April than the bad ones did, the results would be skewed - a classic Simpson's paradox. Instead I took the difference between the April OPS and the mid-season OPS for each player. I then gave each player a weight dependent upon the number of PAs the player had in each period.

To calculate the weight, the standard deviation of the OPS difference for each player was first calculated. This standard deviation was calculated using the number of plate appearances in each period and the fact that the approximate SD of OPS for a single plate appearance is 1.25 (of course, the distribution is not normal, but becomes so after several plate appearances due to the Central Limit Theorem). Once the standard deviation of the difference for each player was determined, we could give each player a weight equal to 1 over the square of the SD (1/variance).
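As a rough illustration of that weighting step, here is a minimal Python sketch. It assumes the plate appearances are independent, so that the variance of a player's OPS over n plate appearances is the single-PA variance divided by n, and the variance of the difference is the sum of the two period variances. The post does not spell out the exact formula, so that formula and the function names diff_sd and weight are my own assumptions; only the 1.25 figure comes from the text.

```python
import math

# Approximate SD of OPS for a single plate appearance, as stated in the text.
SD_PER_PA = 1.25


def diff_sd(pa_april, pa_mid, sd_per_pa=SD_PER_PA):
    """SD of (April OPS - mid-season OPS) for one player.

    Assumption: plate appearances are independent, so each period's OPS
    has variance sd_per_pa**2 / PA and the two variances simply add.
    """
    return sd_per_pa * math.sqrt(1.0 / pa_april + 1.0 / pa_mid)


def weight(pa_april, pa_mid, sd_per_pa=SD_PER_PA):
    """Inverse-variance weight: 1 over the square of the SD."""
    return 1.0 / diff_sd(pa_april, pa_mid, sd_per_pa) ** 2


# Hypothetical player with 100 April PAs and 400 mid-season PAs.
example_sd = diff_sd(100, 400)
example_weight = weight(100, 400)
print(example_sd, example_weight)
```

Under these assumptions a player with few April plate appearances gets a large SD and therefore a small weight, which is the point of the scheme: noisy players count for less.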

Once the weights and the differences in OPS were in place, multiplying each difference by its weight, summing the results, and dividing by the total weight gave a weighted average of the difference in OPS in April vs. the mid-season months. The data was also adjusted using the last 13 years of league-wide data, which show that hitters in general perform slightly worse in both April and September than in the mid-season months. A standard deviation of the weighted average was also calculated by taking the inverse of the square root of the sum of the weights.
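The weighted average and its standard deviation can be sketched in a few lines of Python. The numbers below are made up purely to show the arithmetic, and the league_diff parameter is a hypothetical stand-in for the league-wide April (or September) adjustment, whose actual value the post does not give.

```python
import math


def weighted_average_diff(diffs, weights, league_diff=0.0):
    """Inverse-variance weighted mean of per-player differences.

    diffs:       per-player (April - mid-season) differences
    weights:     per-player weights, each equal to 1 / variance
    league_diff: hypothetical league-wide difference to subtract out
    Returns (adjusted weighted mean, SD of the weighted mean).
    """
    total_weight = sum(weights)
    mean = sum(w * d for d, w in zip(diffs, weights)) / total_weight
    # SD of an inverse-variance weighted mean: 1 / sqrt(sum of weights).
    sd = 1.0 / math.sqrt(total_weight)
    return mean - league_diff, sd


# Made-up example: two players, not real 2006 data.
mean, sd = weighted_average_diff([0.10, -0.05], [30.0, 70.0])
print(mean, sd)
```

With these made-up inputs the heavier-weighted player pulls the average toward his own difference, and the SD depends only on the total weight, not on the differences themselves.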

The Results:

After all of the calculations, what were the results? Are players really affected by playing in the World Baseball Classic? If so, who? And is the effect at the beginning of the year or the end of the year? The table below explains:

The effect for hitters and relievers is negligible. In fact, both hitters and relievers saw their performance in April and September actually improve over their mid-season statistics. However, these figures are not statistically significant and are likely just due to chance. For relievers, a 0.34 ERA improvement might be substantial, but since relief pitchers throw so few innings, the standard deviation of this estimate is high at 0.60, meaning that we really can't draw meaningful conclusions from the data.

However, the real story is the starting pitchers - precisely the group that people worry about in the WBC. Among starters, we see no effect in September due to fatigue, but we see a very real effect in April, where starting pitchers' ERA rose 0.67 points. The standard deviation of this estimate is fairly high at 0.43 points, but even so, the result has a p-value of about .06 (one-sided, since 0.67 is roughly 1.56 standard deviations above zero), meaning that an April decline at least that large would occur by chance only about 6% of the time if the WBC had no effect.
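For readers who want to check the arithmetic, the .06 figure can be reproduced from the two numbers above. This is a sketch that assumes the weighted average is approximately normally distributed and that the test is one-sided (asking only whether ERA got worse); the post does not say which test was used, so both are my assumptions.

```python
import math

era_increase = 0.67  # April ERA rise for WBC starters, from the text
sd_estimate = 0.43   # standard deviation of that estimate, from the text

# Number of standard deviations the estimate sits above zero.
z = era_increase / sd_estimate

# One-sided upper-tail probability of a standard normal, via the
# complementary error function: P(Z > z) = 0.5 * erfc(z / sqrt(2)).
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))

print(round(z, 2), round(p_one_sided, 2))
```

A two-sided test on the same numbers would double the p-value, so the one-sided assumption matters to how the result is read.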

Of the 22 pitchers identified, 14 performed worse in April than in the mid-season months, while only 8 improved. The most damning evidence came from Carlos Silva, Carlos Zambrano, and Miguel Batista, all of whom had horrible early-season performances before righting the ship the rest of the way. Of course, there were exceptions to this as well, but the majority of the evidence shows that WBC pitchers did indeed underperform in the first month of the season, to the tune of 0.67 points of ERA. Below is a list of the best and worst performing pitchers in April in order of influence.

Why do the WBC pitchers tend to perform worse? Most likely, it's due to some or all of the reasons that the WBC naysayers claim - that it takes them out of their spring training rhythm, that it doesn't give them enough time to prepare, that they aren't allowed to stretch out properly, that they're away from their pitching coaches, and that they don't have the opportunity to work on specific pitches in meaningless spring training games. Perhaps it's also due to underwork. Usually a starting pitcher requires about 30 spring training innings to stretch out, but many in the WBC worked only 5 innings during the tournament, forcing them to get in the rest of their work in a shortened spring training.

I enjoy watching the WBC, and it's certainly more fun than watching a few spring training games, but in the face of this evidence, I would not want my starting pitcher playing in the WBC either. The starting pitching at the WBC is already a little thin - in 2006, Zambrano, Peavy, Santana, Colon, and Willis made up the short list of true top starting pitchers who took part, and this year is not much better. Faced with this report, perhaps more teams will hold back their starting pitchers to make sure they are ready for the major-league season.

The evidence is not air-tight - with a p-value of .06, this difference could be due to chance - but if I were an owner, a GM, or a pitcher myself, the 2006 data would give me pause. After the 2009 season is over, more data will be available, and it will be interesting to see whether this evidence is confirmed or refuted. As for now, it may be wise to avoid any WBC pitchers this April in your fantasy league.