thoughts on the Mariners, MLB draft, and more

Projected MLB Standings: Review And Preview

Thanks to the All-Star Break, there's not much use in doing another round of projected standings this week. Instead, I offer an overview of the projected standings for the first half, and what they may indicate about the second half. The data is fairly simple to read, especially if you are familiar with box plots.

The following table includes every MLB team, and a summary of their projected wins for the first half. The first column (7/6) is the projected win total for that team on July 6, the most recent projected standings. The next five numbers are the five used for a box plot: minimum (min), Q1, median (med), Q3, and maximum (max). Maximum, minimum, and median are rather self-explanatory. Q1 and Q3 aren't as well known, but they are just as simple. Basically, if a set of data is split into four equal parts, the median is the number that splits the whole thing in half, Q1 is the number that splits the lower half in half, and Q3 is the number that splits the upper half in half.

This means that half of the data lies between Q1 and Q3, so calculating the inter-quartile range (the difference between Q3 and Q1) is a simple way to measure how spread out the data is (more on inter-quartile range later). Also, I should note that maxes and mins with asterisks are ones that were adjusted. If the true min or max of the data was an outlier (which I defined as more than twice the inter-quartile range below Q1 for a min, or above Q3 for a max), then it was thrown out of the data set. Without further ado, here is the table:

Team          7/6   min    Q1   med    Q3   max
Blue Jays      81    69    76    79    83    84
Devil Rays     61    57    58    61    65    66
Red Sox       102   100   102   105   109   114
White Sox      70    66    68    74    80    85
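For anyone who wants to reproduce this kind of summary, the five numbers and the outlier rule described above can be sketched in a few lines of Python. This is only an illustration, not the actual work behind the table: the function names (quartiles, box_summary) and the sample projections are made up, the quartiles use the split-the-halves approach described earlier, and the sketch assumes that every value beyond the cutoffs is dropped while Q1 and Q3 are still taken from the full data set. It needs at least two values to work.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) by taking the median of each half."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]             # values below the middle position
    upper = data[half + n % 2:]     # values above the middle position
    return median(lower), median(data), median(upper)


def box_summary(values, fence=2.0):
    """Five-number summary with outlying mins and maxes thrown out.

    Assumption: a value is an outlier if it lies more than `fence` times
    the IQR below Q1 or above Q3 (the post uses twice the IQR).
    """
    q1, med, q3 = quartiles(values)
    iqr = q3 - q1
    low_limit = q1 - fence * iqr
    high_limit = q3 + fence * iqr
    kept = [v for v in values if low_limit <= v <= high_limit]
    return {
        "min": min(kept),
        "q1": q1,
        "med": med,
        "q3": q3,
        "max": max(kept),
        # True where the table would show an asterisk
        "min_adjusted": min(kept) != min(values),
        "max_adjusted": max(kept) != max(values),
    }


# Hypothetical weekly projected win totals for one team
sample = [88, 90, 91, 92, 93, 94, 95, 96, 120]
summary = box_summary(sample)
print(summary)
```

In this made-up sample the 120-win projection sits well beyond the upper cutoff, so the reported max falls back to the next highest value and would carry an asterisk.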

First of all, the table gives a nice summary of the first half of the season. However, it also lends some insight into what might unfold in the second half. To start with, I will focus on the inter-quartile range (from now on abbreviated IQR). As previously stated, the IQR quantifies the spread in a set of data, which in terms of the projected standings shows how consistent or inconsistent teams have been. Teams with a small IQR have been very consistent, while ones with a larger one have been more up-and-down. The average IQR was 8. The Indians had the smallest IQR at 1, which means they are about the safest bet to win 93-94 games that you'll ever see. Other teams with low IQRs were the Marlins, Twins, and Yankees, all at 4. So, don't expect any of those teams to make big runs, even though many expect to see the Twins and Yankees surge.
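As a quick check on the arithmetic, here is a small Python sketch that computes the IQR (Q3 minus Q1) for the rows shown in the table above and compares each one to the league average of 8 quoted in the text. The variable and function names are just for illustration.

```python
# Rows as shown in the table: (7/6, min, Q1, med, Q3, max)
table = {
    "Blue Jays": (81, 69, 76, 79, 83, 84),
    "Devil Rays": (61, 57, 58, 61, 65, 66),
    "Red Sox": (102, 100, 102, 105, 109, 114),
    "White Sox": (70, 66, 68, 74, 80, 85),
}

LEAGUE_AVG_IQR = 8  # average IQR quoted in the text


def iqr(row):
    """Inter-quartile range: Q3 minus Q1."""
    _latest, _low, q1, _med, q3, _high = row
    return q3 - q1


iqrs = {team: iqr(row) for team, row in table.items()}

# Most up-and-down teams first
ranked = sorted(iqrs.items(), key=lambda item: item[1], reverse=True)

for team, spread in ranked:
    if spread > LEAGUE_AVG_IQR:
        label = "more up-and-down than average"
    elif spread < LEAGUE_AVG_IQR:
        label = "steadier than average"
    else:
        label = "right at the average"
    print(f"{team:<11} IQR {spread:>2}  {label}")
```

Of these four teams, only the White Sox come out above the league average, which matches their place on the list of high-IQR teams.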

On the flip side, there are plenty of teams with large IQRs, led by the Washington Nationals with a whopping IQR of 20! Unfortunately, they are a really bad team, so it's pretty much the difference between 100 losses and 120 losses in their case. On the other hand, the Mets (IQR 19) are a good team most likely heading to the playoffs. However, it will be interesting to see if they just squeak into the playoffs, or if they are the class of the National League by the end of the year. Based on their first half, they could be either. Making matters even more interesting in the NL East is the fact that the Braves, with an IQR of 11, are also hard to predict. So, as of now, the Braves could end up winning the East if everything fell just right, or the Mets could run away with the division again, or anything in between could happen. Other teams with high IQRs are the Brewers, Giants, Mariners, Reds, Rockies, and White Sox.

Of course, all this analysis depends on teams not changing much in the second half. Between injuries and the trading deadline, that is unrealistic. Really, the statistics show just how beautiful baseball is. Even without trades and injuries, the variance that teams show makes the season unpredictable. Add in trades and injuries and it's a recipe for great intrigue and drama.