Interesting stuff, tianyi. Thanks for putting in the time to run these statistics, and thanks for sharing them with us.
While your results are interesting and informative to a degree, I still hesitate to draw any solid conclusions from them, for the following reasons:
1) I think your sampling method probably skewed the results significantly. Sampling is a critical component of hypothesis testing, and in this case the sample is not truly random, nor can it be considered representative of the population of players in Allen D3. Selecting only the top 25 rpg leaders carries an inherent bias. It ignores PGs, SGs, and SFs completely, and as billyg said, it also includes mostly high REB players on bad teams. I know it'd be a pain in the butt to compile data without using the top 25 reports. But by using only the top 25 reports, you ignore some critical data points, imo.
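To show what I mean about the top 25 problem, here is a rough simulation sketch. Everything in it is made up for illustration: the league size of 300, the REB range of 30 to 99, the "true" effect of 0.10 rpg per REB point, and the noise level are my assumptions, not HD's actual engine or your data. It fits a simple regression on the whole simulated league and again on only the top 25 rpg leaders, repeated over many simulated leagues.

```python
import numpy as np

def fit_slopes(rng, n_players=300, top_n=25, true_slope=0.10, noise_sd=2.0):
    """Simulate one league and return (slope on all players, slope on rpg leaders).

    All parameters are hypothetical values chosen for illustration only.
    """
    reb = rng.uniform(30, 99, n_players)                      # made-up REB ratings
    rpg = true_slope * reb + rng.normal(0, noise_sd, n_players)

    # np.polyfit with degree 1 returns [slope, intercept]
    slope_all = np.polyfit(reb, rpg, 1)[0]

    # Keep only the top_n players by rpg, i.e. select on the outcome itself
    leaders = np.argsort(rpg)[-top_n:]
    slope_leaders = np.polyfit(reb[leaders], rpg[leaders], 1)[0]
    return slope_all, slope_leaders

rng = np.random.default_rng(0)
results = np.array([fit_slopes(rng) for _ in range(500)])
mean_all, mean_leaders = results.mean(axis=0)

print(f"average slope, all players:  {mean_all:.3f}")
print(f"average slope, top 25 only:  {mean_leaders:.3f}")
```

In this toy setup the leaders-only slope comes out well below the one built into the simulation, because the sample was picked on the very thing being predicted. It doesn't prove your numbers are off, only that this kind of sampling can distort a regression.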
2) The more I think about it, the less I believe defensive rpg is the best statistic to use for drawing conclusions about rebounding ability. I think billyg brought up a very good point when he theorized that a rebound is first determined on a team level, and then on an individual level. This may be a stretch, but perhaps defensive rebounds actually work somewhat like assists. This could explain why Carlos Moss went on to average 12.6 rpg per 40 min after Steed graduated…despite only having a REB rating of 65. (That 12.6 would put him at #24 on your list.) I’m not suggesting that rebounds are complete window dressing like assists, but perhaps the individual component of defensive rebounding is.
3) That said, I’d be VERY interested in seeing a multiple regression with ORPG as the dependent variable instead of DRPG. In real life, many consider offensive rebounds to be a better measure of individual rebounding ability than defensive rebounds. So perhaps HD is coded in the same way. There is apparently an individual matchup component of rebounding (according to the release notes)…and this may be reflected in ORPG much more than DRPG.
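In case it helps, here is a bare-bones sketch of the kind of regression I have in mind, with ORPG on the left-hand side. The data is synthetic, and the predictor named other_rating is only a placeholder for whatever else you included in your model; the coefficients used to generate the data are invented, not anything taken from HD.

```python
import numpy as np

rng = np.random.default_rng(1)
n_players = 200  # hypothetical sample size

# Placeholder predictors; swap in the real columns from your spreadsheet
reb = rng.uniform(30, 99, n_players)
other_rating = rng.uniform(30, 99, n_players)

# Synthetic ORPG built from invented coefficients plus noise
orpg = 0.5 + 0.04 * reb + 0.01 * other_rating + rng.normal(0, 0.5, n_players)

# Design matrix with an intercept column, then ordinary least squares
X = np.column_stack([np.ones(n_players), reb, other_rating])
coef, _, _, _ = np.linalg.lstsq(X, orpg, rcond=None)

for name, value in zip(["intercept", "reb", "other_rating"], coef):
    print(f"{name:>12}: {value:.4f}")
```

Running the same thing once with ORPG and once with DRPG as the dependent variable would let you compare how much of each the REB rating explains.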
4) Lastly, this is just a minor quibble. But many statisticians would critique what you said here: “Reb's statistical significant is very large.” Once a significance level (alpha) is set in null hypothesis testing, your results are either significant or not significant…that’s it. A smaller p value doesn't make a result "more" significant. Some scientists don’t have an issue with the way you described the significance…and I'm certainly no scientist...so feel free to ignore my personal pet peeve.
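To put the pet peeve in code form, here is a toy sketch. The alpha of 0.05 and the p values are assumptions picked for illustration, not numbers from your post.

```python
def verdict(p_value, alpha=0.05):
    """Binary decision against a significance level chosen in advance."""
    return "significant" if p_value < alpha else "not significant"

# Hypothetical p values: the first two get the same verdict
for p in (0.04, 0.0001, 0.20):
    print(p, verdict(p))
```

Under this strict reading, 0.04 and 0.0001 land in the same bucket; how strong the relationship is would be a question for the effect size, not the p value.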
