From this thread:
c. Normalization. In determining the outcome of any PA, as mentioned above, WIS takes into account the year and league in which the player actually played. It does this through a process called Log5 Normalization. WIS also factors in the park in which the PA occurs and whether the batter has the platoon advantage over the pitcher. Here is an example given by WIS (from 2009, so it may not reflect subsequent updates, though I believe it is still accurate):
1923 Babe Ruth facing 2000 Pedro Martinez in a neutral park – how often will Babe get a hit against Pedro?
Inputs:
- Ruth’s actual batting average (.393)
- 1923 AL league batting average (.283)
- Pedro’s actual opponent batting average (.167)
- 2000 AL league batting average (.276)
- Babe is left-handed and Pedro is right-handed (Babe gets a 4.5% advantage from this)
- The batter carries more weight than the pitcher in determining whether the outcome of a PA is a hit (WIS uses a 53.3-46.7 split in favor of the batter rather than 50-50)
Here’s the formula before the platoon adjustment:
H/AB = ((1.066*AVG * .934*OAV) / LgAVG) /
       ((1.066*AVG * .934*OAV) / LgAVG + (1.066 - 1.066*AVG) * (.934 - .934*OAV) / (1 - LgAVG))
Where LgAVG = (.934*PLgAVG + 1.066*BLgAVG)/2, BLgAVG is the batter's league average, and PLgAVG is the pitcher's league average.
The 1.066 and .934 reflect the 53.3-46.7 weight in favor of the batter (2 x .533 = 1.066 and 2 x .467 = .934). The output of this formula is a .2502 chance of a hit, which WIS increases by 4.5% (that is, multiplies by 1.045) to .2614 since Babe has the platoon advantage. Park factors would also increase or decrease the final result.
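For anyone who would rather check this in code than in Excel, here is a minimal Python sketch of the formula as quoted above. The function and variable names (log5_hit_probability, PLATOON_BONUS, and so on) are my own, not anything WIS publishes, and the sketch assumes the platoon bonus is applied as a simple multiplication by 1.045. Park factors are left out.

```python
def log5_hit_probability(avg, oav, batter_lg_avg, pitcher_lg_avg,
                         batter_weight=1.066, pitcher_weight=0.934):
    """Chance of a hit per AB before platoon and park adjustments.

    Mirrors the formula quoted in the post; the names are illustrative only.
    """
    # Weighted league average: (.934*PLgAVG + 1.066*BLgAVG) / 2
    lg_avg = (pitcher_weight * pitcher_lg_avg + batter_weight * batter_lg_avg) / 2
    # "Hit" term: weighted batter average times weighted opponent average
    hit = (batter_weight * avg) * (pitcher_weight * oav) / lg_avg
    # "No hit" term: the complementary weighted rates over (1 - LgAVG)
    no_hit = ((batter_weight - batter_weight * avg)
              * (pitcher_weight - pitcher_weight * oav) / (1 - lg_avg))
    return hit / (hit + no_hit)


# Assumption: the 4.5% platoon advantage is a multiplier on the base chance.
PLATOON_BONUS = 1.045

# 1923 Ruth (.393, AL .283) vs. 2000 Pedro (.167 OAV, AL .276)
base = log5_hit_probability(0.393, 0.167, 0.283, 0.276)
with_platoon = base * PLATOON_BONUS
print(f"before platoon: {base:.4f}, after platoon: {with_platoon:.4f}")
```

With the inputs listed above this should come out at roughly .250 before the platoon adjustment and about .261 to .262 after it, in line with the WIS figures; small differences in the last digit are presumably down to rounding of the inputs.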
If you’re savvy with Excel, you can input this formula and play around with it to see how the outcome changes as the League Averages change. In other words, you can see how much normalization matters. If Ruth came from a league like the 1908 NL where the overall average was .239 instead of .283, the outcome (platoon adjustment included) would be .286 rather than .261. Because Ruth’s average would have occurred in a more challenging environment for hitters, the formula gives him more credit for his .393 performance and the expected outcome is more likely to be a hit. Likewise, if Ruth had performed in the 1930 NL (league average = .303), then the outcome of this calculation would be .252: Ruth’s performance is discounted somewhat because it occurred in the most inflated offensive context of the 20th century.
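The same "play around with the league average" experiment can be sketched in Python. This is only an illustration of the quoted formula: hit_chance and the dictionary of leagues are names I made up, Pedro's side is held fixed at his 2000 AL numbers, and the 1.045 platoon multiplier is assumed to be applied at the end.

```python
def hit_chance(avg, oav, batter_lg_avg, pitcher_lg_avg, platoon=1.045):
    """Quoted Log5 formula with the platoon bonus applied (illustrative)."""
    bw, pw = 1.066, 0.934  # 53.3-46.7 weighting in favor of the batter
    lg_avg = (pw * pitcher_lg_avg + bw * batter_lg_avg) / 2
    hit = (bw * avg) * (pw * oav) / lg_avg
    no_hit = (bw - bw * avg) * (pw - pw * oav) / (1 - lg_avg)
    return platoon * hit / (hit + no_hit)


# League batting averages taken from the post
batter_leagues = {"1908 NL": 0.239, "1923 AL": 0.283, "1930 NL": 0.303}

# Ruth's .393 against Pedro's .167 OAV (2000 AL, .276), varying Ruth's league
results = {
    league: hit_chance(0.393, 0.167, lg_avg, 0.276)
    for league, lg_avg in batter_leagues.items()
}
for league, chance in results.items():
    print(f"{league}: {chance:.3f}")
```

The three results should land close to the .286, .261, and .252 quoted above: the lower the league average Ruth's .393 came from, the higher his expected chance of a hit.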
If your eyes glazed over when you read the last couple of paragraphs, don’t worry about the formulas. You never have to actually do these calculations. What’s important to remember is that the league the player actually played in matters. Therefore, when I search for players to draft, I almost never look at the raw stats. I look at the stats that are followed by the “+” and “#” signs, which have done all of the work for you. The “+” stats compare the player’s performance in that stat to the league average that season. Numbers greater than 100 are better than average; numbers less than 100 are below average. The “#” stats basically show you how the Log5 Normalization calculation comes out against an “average” opponent. In general, when comparing players I am much more interested in how their “+” and “#” stats compare than in how their raw stats compare. All else being equal, I want the guy with the higher “+” number and the better “#” number.
For example, Carlos Beltran in 2006 hit 41 HR in 617 PA for the Mets. Cy Williams in 1923 hit 41 HR in 636 PA for the Phillies. Williams’s HR/100PA# is 9, while Beltran’s is 6. Williams will hit many more HR, all else being equal, than Beltran will in WIS.
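To see why the raw stats are misleading here, a quick sketch of the raw HR per 100 PA for both seasons, using only the HR and PA totals above (the variable names are mine):

```python
# (HR, PA) for each season, as given in the post
seasons = {
    "Carlos Beltran 2006": (41, 617),
    "Cy Williams 1923": (41, 636),
}

# Raw, un-normalized home runs per 100 plate appearances
raw_rates = {name: 100 * hr / pa for name, (hr, pa) in seasons.items()}

for name, rate in raw_rates.items():
    print(f"{name}: {rate:.2f} HR per 100 PA (raw)")
```

The raw rates are nearly identical, with Beltran's slightly higher, so the gap between a HR/100PA# of 9 and one of 6 comes from normalization against each player's league, not from the raw totals.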