This morning I ran across a cool post called the Hall of Fame ballot collecting gizmo. It's simply an amalgamation of Hall of Fame ballots that have been made public. There are 85 as of the time I write this post (and the gizmo updates frequently).
I could debate who should or should not be in, but instead I'd rather offer some cold, hard facts to wrap up 2013. It turns out that we can figure out quite a bit from only the 85 ballots the gizmo has so far. The data already suggest a couple of surprises, at least to me.
Let me briefly explain the statistical theory underpinning this analysis. Disclaimer: I'm an amateur statistician, to put it lightly, though at least an amateur statistician with a math degree. Let's treat the 85 ballots as a sample of a population, which is reasonable because that's exactly what it is.
I used the central limit theorem, a null hypothesis of .75, and a 99% confidence level for my analysis - which, in English, means I asked, "If I pick 85 ballots at random out of all the Hall of Fame ballots, what range of percentages could I plausibly observe if a player's final vote total actually turns out to be 75 percent?" Intuitively, it's not too hard to understand. I'm treating the current 85 ballots as one subset of the total population of ballots, and saying that there's still lots of room for the final percentage to fluctuate as more votes get tallied...but, at the same time, the current votes probably say something about what the final total will look like. I picked 75% because that's the magical line where a player becomes a Hall of Famer.
The coolest thing, theoretically, about this method is that it says the percentage we see in a sample like ours will be approximately normally distributed no matter the distribution of the population - in other words, I don't need to know anything about what the actual population looks like, other than the vote totals (which we know!) to get some results.
Now the fun part - the results!
With 85 ballots, we can say that if a candidate's final total really is 75% of the vote, then 99% of the time he would be listed on between 62.9% and 87.1% of the 85 ballots cast. That's a huge spread, as might be expected with such a high confidence level and relatively few ballots cast. It actually tells us pretty much everything we need to know, though, when we look at the current vote totals. Allow me to explain. Players not in that range are highly unlikely to finish with a vote total at 75% - moreover, players below 62.9% are highly unlikely to get to 75%, and players above 87.1% are highly unlikely to dip below 75%.
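For anyone who wants to check the arithmetic, here is a minimal Python sketch of that band. It assumes the usual normal approximation for a sample proportion, with the standard error computed from the null value of .75. The function name `acceptance_band` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def acceptance_band(p0: float, n: int, confidence: float) -> tuple[float, float]:
    """Range of sample proportions consistent with a true proportion p0.

    Assumption: normal approximation to the sample proportion (central
    limit theorem), with the standard error evaluated at p0.
    """
    # Two-sided critical value, roughly 2.576 for a 99% level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error of a sample proportion when the true proportion is p0.
    se = sqrt(p0 * (1 - p0) / n)
    return p0 - z * se, p0 + z * se


low, high = acceptance_band(p0=0.75, n=85, confidence=0.99)
print(f"{low:.1%} to {high:.1%}")
```

With 85 ballots the margin works out to about 12.1 percentage points on either side of 75%, which is where the 62.9% and 87.1% cutoffs come from.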
Three players are above 87.1% in the 85 ballots cast - Greg Maddux (100%), Tom Glavine (98.8%), and Frank Thomas (88.2%). Already, the data suggests that the class of 2014 will have at least three inductees.
Most vote totals fall shy of the 62.9% mark, as would be expected. This includes all the usual steroid suspects - Bonds, Clemens, McGwire, Sosa, and Palmeiro. No surprises there.
In fact, there are only four players who fall in the 75% band - Craig Biggio (81.2%), Mike Piazza (74.1%), Jeff Bagwell (63.5%), and Jack Morris (63.5%). Intuitively, it's easy to guess that Biggio's chances are very good, Piazza is right on the bubble, and both Bagwell and Morris are probably on the outside looking in. The calculations confirm that, though more strongly than I expected. Biggio's vote total is 92.8% likely to stay above 75%, Piazza has a 42.5% chance to make it, and both Bagwell's and Morris's odds of getting up to the Hall of Fame line this year stand at a measly 1.4%. Bagwell is a safe bet to get in some year, but this is Jack Morris's last year on the ballot.
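Here is a rough Python sketch of one calculation that lands on these odds. It is a reconstruction under stated assumptions, not necessarily the exact method used: the final share is treated as normally distributed around each player's observed share, the standard error comes from that observed share rather than from .75, and the inputs are the rounded percentages quoted above. The name `chance_above_line` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

N_BALLOTS = 85
LINE = 0.75  # share of the vote needed for induction


def chance_above_line(observed: float, n: int = N_BALLOTS, line: float = LINE) -> float:
    """Approximate probability that the final share ends up above `line`.

    Assumption: the final share is modeled as normal, centered on the
    observed share, with standard error sqrt(p * (1 - p) / n).
    """
    se = sqrt(observed * (1 - observed) / n)
    return 1 - NormalDist(mu=observed, sigma=se).cdf(line)


# Rounded shares of the 85 public ballots, as quoted in the post.
shares = {"Biggio": 0.812, "Piazza": 0.741, "Bagwell": 0.635, "Morris": 0.635}
for name, share in shares.items():
    print(f"{name}: {chance_above_line(share):.1%}")
```

Using the standard error from .75 instead would give noticeably different odds, so the choice of which proportion goes into the standard error matters here.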
Even with only 85 ballots counted, there is an extremely good chance that this year's Hall of Fame class has four inductees - Greg Maddux, Tom Glavine, Frank Thomas, and Craig Biggio. That would be a historically large class. The last time four players went in on the same ballot was 1955. If Piazza makes it in, then the 2014 class would tie for the largest ever, matching only 1936, the first class ever inducted into the Hall of Fame.
We will see what actually happens. For better or worse, though, get ready for a historically large class that defies all the complaints about a crowded ballot.