The Pythagorean Expectation is a formula concocted by Bill James to estimate how many games a team should win, based on the number of runs the team scores versus the number of runs it allows over the course of a season.
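For anyone who hasn't seen it, James's original version squares the run totals: expected win% = RS² / (RS² + RA²). A minimal sketch in Python (the exponent of 2 is James's original choice; later refinements use values closer to 1.83, which I've left as an optional parameter):

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2):
    """Bill James's Pythagorean Expectation: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# Example: a team scoring 800 runs and allowing 700 over a 162-game season
pct = pythagorean_win_pct(800, 700)
expected_wins = pct * 162  # roughly 92 wins
```

Multiply the percentage by games played to get expected wins, then subtract from actual wins to see how far a team over- or under-achieved.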
How good is this formula, really?
Recently, I had way too much time on my hands for a couple of days, so I compiled stats from one of my worlds: the total runs scored and total runs allowed by each franchise over the history of the world. Probably not very meaningful, but I was looking for a way to pass some time. After compiling that information, I calculated the expected win percentage for each franchise over the world's history and compared it to actual wins. I was looking to see who the over-achievers and under-achievers were.
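The comparison I ran can be sketched like this (the franchise names and run totals below are purely illustrative placeholders, not my actual world's numbers):

```python
def wins_above_expectation(actual_wins, games, runs_scored, runs_allowed):
    """Actual wins minus Pythagorean-expected wins (exponent 2)."""
    expected = games * runs_scored**2 / (runs_scored**2 + runs_allowed**2)
    return actual_wins - expected

# Hypothetical all-time franchise totals: (wins, games, RS, RA)
franchises = {
    "Franchise A": (1650, 3078, 14800, 13200),
    "Franchise B": (1540, 3078, 14100, 13900),
}

diffs = {name: wins_above_expectation(w, g, rs, ra)
         for name, (w, g, rs, ra) in franchises.items()}
# Sort most over-achieving first
for name, diff in sorted(diffs.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {diff:+.1f} wins vs. expectation")
```

A negative number means the franchise under-achieved relative to its run totals; positive means it over-achieved.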
The results were kind of surprising.
Most surprising was the fact that the one team that had (a) won the most games over the history of the world, (b) scored the most runs, and (c) allowed the fewest runs, came in as the biggest under-achiever of all 32 franchises, clocking in at (-67) wins below expectations.
In fact, of the five winningest franchises in world history, four of them came in among the five biggest under-achievers. The fifth one came in as a slight over-achiever, at +9 wins above expectations.
So this makes me wonder: is my approach flawed because I'm looking at too large a sample (19 seasons), with teams that may have gone through multiple ownership changes and many cycles of up and down seasons? Or is there some inherent flaw in the theory behind the formula itself, such that any correlation between what the formula "predicts" and what actually happens is little more than coincidence?