Posted by dahsdebater on 2/19/2012 4:36:00 PM:
I think that the confounding you are suggesting is much more of a problem analyzing offense than defense. That's a big part of why I didn't bother. Also because it's easier to learn about offensive efficiency based on the eye test, just watching what happens. The game compiles offensive stats for us. No such advantages for defensive stats. Also, other than double teams, which are uncommon enough not to be a major confounding problem, the only defensive setting that coaches can adjust is +/-. I figure over a large enough sample those should largely balance each other. Also, a large portion of the sample was sim coached, so they always come out in a 0. I really think the biggest flaw in the analysis is the assumption of starter-on-starter matchups.
the confounding you talk about here is all the little side things, IMO - the implicit variables, like what a coach does and who else is on the floor - not the variables in question, like ath, def and spd. that's not to say they're unimportant; the starter-on-starter assumption is probably a pretty big one.
what i am saying is that in general, and for defense too, the biggest problem is that the key variables you are looking at - the players' ratings - are not independent of each other. you say you focus on defense, so on defense you have ath, def, sb, and reb, which are all hugely important - and ath and def are very strongly correlated, as are sb and reb. any judgement you make about any of those 4 ratings based on a regression is going to be very shaky, because you never know how much of what you are seeing is the effect of the rating in question and how much is the effect of the rating that moves with it.
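here is a rough sketch of that shakiness, in case it helps. everything in it is made up for illustration: the rating range, the noise levels, and the "true" effect of 0.1 FG% points per rating point are all assumptions, not anything taken from the game engine. it simulates many seasons where def closely tracks ath, fits a plain least-squares regression each time, and compares how much the individual ath and def estimates bounce around against how much their sum does.

```python
import random
import statistics

def fit_two_predictors(rows):
    """Least squares for y = a + b1*x1 + b2*x2; returns the slopes (b1, b2)."""
    n = len(rows)
    m1 = sum(r[0] for r in rows) / n
    m2 = sum(r[1] for r in rows) / n
    my = sum(r[2] for r in rows) / n
    s11 = sum((r[0] - m1) ** 2 for r in rows)
    s22 = sum((r[1] - m2) ** 2 for r in rows)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in rows)
    s1y = sum((r[0] - m1) * (r[2] - my) for r in rows)
    s2y = sum((r[1] - m2) * (r[2] - my) for r in rows)
    det = s11 * s22 - s12 ** 2  # close to zero when x1 and x2 are strongly correlated
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def simulate_season(rng, n_players=200):
    """Hypothetical data: def tracks ath closely, both lower opponent FG%."""
    rows = []
    for _ in range(n_players):
        ath = rng.uniform(50, 95)
        dfn = ath + rng.gauss(0, 3)  # assumed: def is ath plus a little noise
        # assumed "true" effects: -0.1 FG% points per point of ath and of def
        fg = 60 - 0.1 * ath - 0.1 * dfn + rng.gauss(0, 4)
        rows.append((ath, dfn, fg))
    return rows

rng = random.Random(0)
fits = [fit_two_predictors(simulate_season(rng)) for _ in range(200)]

spread_ath = statistics.stdev(b1 for b1, _ in fits)
spread_def = statistics.stdev(b2 for _, b2 in fits)
combined = [b1 + b2 for b1, b2 in fits]
spread_combined = statistics.stdev(combined)

print(f"spread of ath estimate:       {spread_ath:.3f}")
print(f"spread of def estimate:       {spread_def:.3f}")
print(f"spread of ath + def estimate: {spread_combined:.3f}")
```

under these assumptions the combined ath + def effect comes out about the same every season, while the individual ath and def estimates swing a lot from one season to the next. the regression can tell you what the pair is worth much better than it can tell you how to split the credit.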
for example, a player with great ath almost always has great def. so, you look at the impact of ath on fg%. you see that a player with 70 ath lets his opponent shoot 10% better than a player with 90 ath does (obviously an overstatement, but to keep the #s simple...). well, if you could say, give me a 70 ath 70 def player and compare him with a 90 ath 70 def player, you might be able to say look - that 10% higher FG% is due to the 20 ath. however, because the 90 ath guy probably has 90 def, all you are really saying is that going from 70/70 to 90/90 is worth 10%. well, is the 20 ath worth 9% and the 20 def 1%? or the 20 ath 1% and the 20 def 9%? or 5% each? how do you know? same goes for sb and reb!
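to put the 70/70 vs 90/90 numbers above into something you can run: the little sketch below tries all three splits. the 50% baseline for the 70/70 defender and the straight-line effect are assumptions i picked to keep the arithmetic clean, and the 10% is treated as 10 points of FG%.

```python
# assumed baseline: opponent shoots 50% against the 70 ath / 70 def defender
BASE_FG = 50.0

def opponent_fg(ath, dfn, ath_worth, def_worth):
    """Opponent FG% if +20 ath is worth `ath_worth` points and +20 def is worth `def_worth`."""
    return BASE_FG - ath_worth * (ath - 70) / 20 - def_worth * (dfn - 70) / 20

# the three ways of splitting the 10 points between ath and def
splits = [(9, 1), (1, 9), (5, 5)]

fit_70_70 = []
fit_90_90 = []
guess_90_70 = []
for ath_worth, def_worth in splits:
    fit_70_70.append(opponent_fg(70, 70, ath_worth, def_worth))
    fit_90_90.append(opponent_fg(90, 90, ath_worth, def_worth))
    # the player we never get to observe: 90 ath with only 70 def
    guess_90_70.append(opponent_fg(90, 70, ath_worth, def_worth))

for split, low, high, guess in zip(splits, fit_70_70, fit_90_90, guess_90_70):
    print(f"split {split}: 70/70 -> {low}, 90/90 -> {high}, 90/70 -> {guess}")
```

all three splits reproduce the two players you actually see (50 and 40), but they disagree completely about the 90 ath / 70 def guy: 41, 49 or 45. the data you have can't pick between them.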