First, points 1-4 seem to me quite right and I have no quarrel with them. So the question would be: what is the best way to measure such an effect OR to identify it as a reality if it escapes QUANTITATIVE measurement (which as you point out is probably rarer in baseball than in some other areas of life).
Case 4 a) as you say, is not very probable.
4 b) has a shot but is more or less as you suggest - our tools need sharpening.
4 c) is a very valuable discussion. I will take only one part of it: you are absolutely right about anecdotal evidence: we should not take individual events and use them to jump to broad conclusions. But that does not mean they have no value.
Here we have the distinction in social science between nomothetic and idiographic. Nomothetic tries to establish what the "laws" are that govern identifiable phenomena or activity that we are studying. What causes growth in economies? Why do some countries adopt democracy more readily? What makes for championship teams?
Idiographic: The Greek root graphic refers to writing about something, while "idio" comes from the Greek word for individual or single person. The word "idiot" in ancient Greece meant someone who stayed out of political activity, who did not attend the assemblies, but kept to their own private life.
Here though it refers to individual events. This is the domain not of the sociologist, economist, or political scientist but of the historian and the journalist (BIG debate in anthropology whether it is nomothetic or idiographic with a postmodernist wing weighing in as well. When they sort it out I will let you know who won).
Historians, and their subgroup biographers, write about the 30 Years War, the Election of 2000, about 9/11, or the life of Winston Churchill or Addie Joss.
They are not trying to suggest that all wars, leaders, elections, terrorist attacks, pitchers are like this or work this way. Just telling the narrative of a singularity.
Now, an individual case, even one that is extended like the use of the shift against Ted Williams in the 1946 World Series, raises some issues. First, did it happen? If so, how do we know?
Let's say we have films of the infield that were taken every time Williams came to bat and we note that the infielders are all shifted over to the right side to a greater degree than with other batters. Every time.
We still have some epistemological issues to confront (epistemological means "how do we know what we know?"). Do we know that the infielders are shifted that way not because of a series of individual decisions to move to the right side, but because of a conscious strategy on the part of the Cardinals? And do we know that this was the case with all infielders or only one or two, the others playing where they felt like based on their own knowledge of baseball?
We "know" to the extent that we do, because the Cardinals told us. Their manager, their players, have provided individual testimonies, that are more or less in harmony with each other's accounts, that this is what they did.
Do we know now? Well, what if they just want us to think that this is what they did? That they want us to think that this was a brilliant plan they had all along but are telling us so after the fact? Why would they do this? To look smart and also to look like their victory in the WS was because of their smarts and therefore a credit to them and not the result of being luckier just enough times, and to take credit for something that may or may not have been the result of their actions: Williams' miserable performance at bat in the Series. It is possible Williams was just in a "slump" - if such things exist (I am not sure if they can be shown to exist statistically or not, or whether they can be explained given a statistical significance etc.)
Now here the individual testimonies are crucial to the narrative that gives meaning to what we see when we take a photo, from a hot air balloon above the stadium, of the Cards' defense during the '46 WS with and without Williams at bat. Without the narrative the photo has no meaning. We don't even see a baseball game, just some guys in uniforms and gloves standing around. The numbers NEVER speak for themselves and a picture IS NOT WORTH A THOUSAND WORDS. (God also does not work in mysterious ways, but that is another conversation for another time).
There are all sorts of problems with individual memory and testimony. The closer the testimony is to the event the more likely memory is to be accurate BUT the more likely the person is to have a stake in the "spin," the interpretation of the event in their own self-interest.
BUT, in court, with life and death, freedom and incarceration, justice and injustice, law and punishment, wealth and hardship at stake, we accept individual testimonies as legitimate evidence. We do the same in journalism, or at least we did back when journalists did journalism and did not just interview each other on talk shows. (Disclaimer, I am married to a journalist, who, however, is a strong exception to the generalization about today's journalists just mentioned).
In doing so, in accepting individual testimony, and not only statistical evidence, AS evidence, and in the most important contexts imaginable (and one could add that an umpire's ruling is an individual testimony with all the strengths and weaknesses those entail), we do a qualitative sort of quantitative evaluation of the worth of such testimony. In other words, we try to figure out to what extent such testimony approximates X value, with X being a perfect reconstruction of the events as they happened. We do this by interrogating the testimony as a form of evidence: how long ago was it? How close was the person to what happened? How reliable do they seem as a person? And most important of all: is this an eyewitness account, or second hand?
In court we do not allow most second hand testimony, though we do sometimes (cellmate testifying that the defendant confessed to the crime for example) but with some caution (or ideally this is how we do it). In journalism we are a little looser: usually, at least under the old rules when editors did their jobs (NY Times and Judith Miller I am looking at you; Fox News I am not even bothering, it's pointless - oh by the way, my daughter, born of a US citizen overseas, is eligible to be President of the US some day since she is a natural born citizen under the Naturalization Act of 1790, as are the kids of all the US soldiers I teach for a living who are based overseas and who were born to their parents in foreign countries. In other words, since Pres. Obama's mother was from Kansas, even if he WERE born in Kenya, Indonesia, or the Soviet Union or the North Pole, he is still a natural born US citizen and eligible to be President. So his birth certificate from Hawaii was irrelevant. Completely. Doesn't matter. If the law were otherwise it would mean that: John McCain was not eligible to be President; the children of US troops overseas are not eligible, and that by serving their country they have disenfranchised their own kids from being Americans, turning them into immigrants. Try telling them that, I know hundreds of them. I will be happy to introduce you to them. But I digress).
The rules of "old journalism" were that the first source had to be an eyewitness and in a position to know and, if the story was a political question or one involving a large company or organization, had to be high up. The second had to be pretty close and also able to confirm independently. Third sources were preferred when possible but in this case testimony could be second hand so long as the source was credible. The more sources the better.
But not all sources are or were equal, so that we have a means of quantifying, through approximation, the value, at least the relative value, of qualitative testimony. We should do something similar in baseball when it can be useful (not always, when it serves a purpose though).
Take an analogous situation to the SB/CS one contrarian23 has discussed above: power hitting in the post-deadball era changed pitching. Pitchers now had to bear down on every hitter, every pitch, or risk a run scoring. Before, they could be slackers half the time against weak hitters, with a dead ball and home runs being rare. Now, we do find that from 1920 or so IP per season go way down and almost no one except Wilbur Wood really tries to pitch as often as before and pitch every inning of every game and so on.
Now what we really have is a bunch of circumstantial evidence: the ball changed, home runs became more common, pitchers pitched fewer innings. Cause and effect are very, very loosely connected here. What connects them is testimony - we have a lot of people from that time saying that pitchers now had to bear down, which is a way of saying that they did not before which is slander in most places and times. But it may be true. The testimony and the empirical evidence TOGETHER make the thing more likely. Neither on its own is strong enough in my view. But both need a context: a narrative that gives meaning to the change that occurred, just as the Williams shift is a narrative that gives meaning to what otherwise are just a bunch of facts and imaginary photos we took from a blimp.
The narrative is that pitching was changed by the end of the deadball (and the spitter) and the rise of the home run.
Such narratives can and should be challenged, and questioned. And using statistical and empirical methods to challenge them is an important and useful way.
But just as we cannot throw out the baby with the bathwater and declare there to be no external world because our language cannot fully capture it, and end up with the self-indulgence of the postmodernists (old joke: postmodernist anthropologist interviewing natives of a primitive tribe about their culture: "But enough about you. Let's talk about me"), we should not declare something non-existent merely because we don't find a statistically significant difference that shows us it is there. If a bunch of old players say, "Oh yeah, when Coleman was on we were thrown off and batters took advantage, hitting into the hole, etc." we note it, but take it with a grain of salt unless we see lots of such testimony, and even then it could be a combo of memory of one or two events and having heard such discussions by other players from the time, thus creating a collective false memory. But if players at the time when Coleman played were already saying it, that is stronger, since memory plays less of a role. Then when we see biographies and autobiographies of managers telling a similar tale it is strengthened. And so on. Evidence piling on evidence, each weighed for its approximate relative value. At that point if we also find empirical evidence that backs it up - hitters had higher averages when he was on base - we have a strong case for the jury. But this is analogous to the "threat of a home run at any time" narrative about pitchers after the deadball ended.
One last note: there was a movement in history similar to sabermetrics some decades ago, built on highly sophisticated statistical analyses. It still has influence, though more in political science than in history. One of its great findings was that the industrial revolution never happened. Nope. Can't find any statistically significant "leap" in English production from 1780-1830 (the decades of the industrial revolution) compared with decades before and after. Nor any 5 or 10 year period that dramatically stands out.
So the industrial revolution is a myth. Statistically speaking. Here I can only say: read Arnold Toynbee's original great lectures on the industrial revolution (you can find them online I think), and see how he defines it, and then form an opinion about the stats on British growth (also online). I am with Toynbee (and Karl Marx, who was using the phrase "industrial revolution" before Toynbee popularized it).
'Cause to think that there was no industrial revolution is kinda dumb. Really.