*Posted by hughesjr on 11/26/2012 4:23:00 PM (view original):*

*Posted by coach_billyg on 11/26/2012 4:05:00 PM (view original):*

*Posted by hughesjr on 11/26/2012 3:33:00 PM (view original):*

BTW, it is not just adding numbers together ... it is taking 100%, splitting that up into component parts, multiplying that percentage against the attribute, and adding those corrected values together.

By doing it this way, all your overall scores in any area will be in the range of 0-100 when comparing them to each other.

what you are describing is, effectively, adding the ratings together. you are normalizing to a scale but that's not really the point.

to come up with the "right" formulas, dahs is right, adding definitely does not do the job. i wrote myself a program, similar to yatzr's but trimmed down, so i could rank recruits by custom formulas and recruit off it - although it was more about being able to play with the formulas than actually recruiting (and i never really used it to recruit, but i played with formulas a **** ton). at a minimum you need cross terms (products of two ratings). for example, as we've been discussing, if you have good ath, lp is more valuable. if you have good spd, per is more valuable. so, to really capture the detail, you need an offense equation that would be something like:

3 point shooting = a*ATH + b*SPD + c*PER + d*BH + e*SPD*PER + f*BH*PER

Billy ... what are factor 'e' and factor 'f'?

I am doing the a*ATH+b*SPD+c*PER+d*BH

Now if you effectively define (the way you are describing) a, b, c, d ... and if none of the other attributes matter, you would add those up and they would equal 100.

What would the e*SPD*PER do that you could not do with b*SPD and c*PER ?

I also think I may be doing the same thing by having 5 different "offensive types" ... in one type LP and ATH are very important and in another type, PER and SPEED are more important. Whichever number is higher (between those types) would be how that player is rated ... I'll post an example

i know what you mean. in your last statement, i was going to say something along those lines. we SORT of approximate it by saying for a slashing guard, ath/lp are more valuable, that kind of stuff. it's a similar thing, but sort of in a different dimension.

so, here's the thing. for a 3 point shooting guard, for example, ath/lp are less important than for other scoring guards, and spd/per are more valuable. but EVEN WITHIN THE SET of 3 point shooting guards, the values of spd and per vary together. those work off each other.

so, to simplify this, let's assume spd = per in value for a 3 point shooter, and use a coefficient of 1 on each. (i don't support the effort to put things on a 1 to 100 constraint, and it gets more difficult when you add multiplicative factors like e*SPD*PER, so i will not be normalizing to that range here.) under that assumption, a guy with 50 spd and 50 per gets a rating of 100 from spd/per. 60 spd/per takes him to 120, and 70 spd/per takes him to 140. but that is not an accurate representation. the value of per is HIGHER when you have the spd to support it. instead of 50, 60, 70, 80, 90 spd/per being worth 100, 120, 140, 160, and 180 points, respectively, maybe it's 100, 125, 155, 190, and 250.

in reality, a 50 spd/per d2 guard isn't very good, and 60 spd/per isn't much better. a 70 spd/per guy can be pretty useful, an 80 spd/per guy can be a very good scorer, and a 90 spd/per scorer can be elite. that is a non linear relationship!

and that is really the key. so, to get around that, you need a different relationship. dahs suggests a quadratic on every variable, and multiplying them, but i am suggesting combining individual ratings with products of ratings, instead of your straight linear model.

so, to get back to the actual equation, say you use:

3 point shooting complex = a*ATH + b*SPD + c*PER + d*BH + e*SPD*PER + f*BH*PER

instead of

3 point shooting simple = a*ATH + b*SPD + c*PER + d*BH

let's say in the simple model, you used: .3*ath + .9*spd + 1.1*per + .7*bh

well, in the complex model, you might use: .3*ath + .7*spd + .7*per + .5*bh + (.4/50)*spd*per + (.4/50)*bh*per, just as a simple example. you can also think of that spd*per term as .4*spd*per/50, or .008*per*spd

the reason i show the coefficient as .4/50 is to make it easier to understand. if i wrote that tiny coefficient (.008), you might think, why is that even worth including? well, you are multiplying it by spd AND per, so the product is going to be a much larger value. if spd = 50, then that spd*per term is .4*per (same as .4*50*per/50). and really, that (.4/50)*spd*per can be read, and maybe it's easier for you to write it this way, as .2*per*(spd/50) + .2*spd*(per/50). it's the same, but maybe easier to wrap your head around. if spd = per = bh = 50, then those terms basically break down from

.2*per*(spd/50) + .2*spd*(per/50) + .2*per*(bh/50) + .2*bh*(per/50)

to

.2*per + .2*spd + .2*per + .2*bh

which, when you put it with the first part of the complex equation, is the same as the simple equation. so it's saying that around the 50 level, the values of the ratings are the same as in the simple equation like yours. but at higher values, say per = spd = bh = 75, then it's like having .3s instead of .2s in the above part (75/50 = 1.5, and 1.5 times .2 is .3), which makes spd, bh, and per all more useful as the ratings get higher. basically, the goal here is to encapsulate the reality that the marginal value of per is not only increasing in and of itself, but increasing with spd, too. the fact that the marginal value of per is increasing in and of itself is the reason that dahs's use of exponents makes some sense. you really probably get the best regression with something like this:

3 point shooting = a*ath + b*spd + c*per + d*bh + e*spd*per + f*bh*per + g*ath^2 + h*spd^2 + i*per^2 + j*bh^2

you might even make those squared exponents variables in the regression, but it's the same idea...
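for anyone who wants to try a regression on that 10-term equation, here is a rough sketch of how a fit could look in plain python. everything in it is an assumption for illustration: the 200 "players" are random fake ratings, the "true" coefficients are just the example complex formula from this post (with the squared terms set to 0), and the helper names (`features`, `solve`, `fit`) are made up. ratings are scaled to 0..1 before fitting, purely to keep the arithmetic well behaved. to use it for real you would need actual player ratings and some measured outcome to fit against, which this post does not provide.

```python
import random

NAMES = ["ath", "spd", "per", "bh", "spd*per", "bh*per",
         "ath^2", "spd^2", "per^2", "bh^2"]

def features(ath, spd, per, bh):
    """One row of regression inputs; ratings are scaled to 0..1 (my choice)."""
    a, s, p, b = ath / 100, spd / 100, per / 100, bh / 100
    return [a, s, p, b, s * p, b * p, a * a, s * s, p * p, b * b]

def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)

# ASSUMPTION: the example complex formula, rewritten for 0..1 ratings
# (.3*ath -> 30, .7 -> 70, .5 -> 50, .008*spd*per -> 80), squared terms 0.
TRUE = [30, 70, 70, 50, 80, 80, 0, 0, 0, 0]

# FAKE DATA: random ratings and noise-free targets built from TRUE.
rng = random.Random(0)
players = [[rng.randint(1, 99) for _ in range(4)] for _ in range(200)]
rows = [features(*p) for p in players]
targets = [sum(w * f for w, f in zip(TRUE, row)) for row in rows]

coeffs = fit(rows, targets)
for name, w in zip(NAMES, coeffs):
    print(f"{name:8s} {w:9.4f}")
```

since the fake targets have no noise, the fit should just hand back the coefficients it was generated from, which only proves the plumbing works. to read a fitted coefficient on the raw 0-100 ratings, divide the linear ones by 100 and the product/squared ones by 10000 (so 80 becomes .008).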

anyway, getting back to it, using the simple equation and the complex equation:

simple - .3*ath + .9*spd + 1.1*per + .7*bh

complex - .3*ath + .7*spd + .7*per + .5*bh + (.4/50)*spd*per + (.4/50)*bh*per

then if you have a player with 50 ath, and either 50 per/bh/spd, or 60, 70, 80, 90, here are the values he might have (note i was looking at a graph on google, values not exact, i'm rounding)

complex player ratings: 50 ath, X per/bh/spd

50: 150

60: 186 (+36)

70: 226 (+40)

80: 269 (+43)

90: 316 (+47)

you can control that to make different relationships, especially when you include exponents, you can really get whatever model you want. contrast that to the simple model:

simple player ratings: 50 ath, X per/bh/spd

50: 150

60: 177 (+27)

70: 204 (+27)

80: 231 (+27)

90: 258 (+27)
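if you want to reproduce the two tables without reading values off a graph, a small script along these lines should do it. it just plugs the example coefficients from this post into both formulas; the function names (`simple_model`, `complex_model`, `table`) are made up for the sketch.

```python
def simple_model(ath, spd, per, bh):
    # example coefficients from the post, nothing more
    return 0.3 * ath + 0.9 * spd + 1.1 * per + 0.7 * bh

def complex_model(ath, spd, per, bh):
    linear = 0.3 * ath + 0.7 * spd + 0.7 * per + 0.5 * bh
    cross = (0.4 / 50) * spd * per + (0.4 / 50) * bh * per
    return linear + cross

def table(formula, ath=50, levels=(50, 60, 70, 80, 90)):
    """Rows of (level, score, step from previous level), with spd = per = bh = level."""
    rows, prev = [], None
    for x in levels:
        score = formula(ath, x, x, x)
        step = None if prev is None else score - prev
        rows.append((x, score, step))
        prev = score
    return rows

for name, formula in (("complex", complex_model), ("simple", simple_model)):
    print(name)
    for x, score, step in table(formula):
        extra = "" if step is None else f" ({step:+.1f})"
        print(f"  {x}: {score:.1f}{extra}")
```

the exact complex values come out within about a point of the rounded ones listed above, and the simple model steps by the same 27 every time, which is the whole contrast.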

so, as you can see, in that model you have significantly less flexibility. you have a linear relationship as the ratings increase, but it's not really accurate. i think it's accurate to say, in d2, for a guy who sucks at spd/bh, going from 50 to 90 per is really not very useful at all, but for a guy with great spd/bh, going from 50 to 90 per is HUGE. you just can't capture that effect with the simple formulas we've been tossing around in this thread.

still, that doesn't mean it's not useful to talk about them. if you get too complex, it gets harder to wrap your head around, and fewer coaches can follow it. note that you can talk in just as much detail by saying, for a player like 50 ath, 80 spd, 70 per, 70 bh, the marginal values of the ratings are ath = .3, spd = .9 (1 point of speed is worth 3 points of ath), per = 1.1, bh = .7, or whatever you think it is. that's why in an earlier post, i was saying, well, for a player with this set of ratings, here is how i value ratings, and for a player with that set of ratings, that's how i value ratings there.

the above lets you give a lot of detail in a way that people can understand, but it doesn't define what the relationships are at every stage. however, if you give a few different players and their ratings, i think people can extrapolate decently. on the other hand, it's very difficult to come up with the equations that result in a curve that matches those relationships and properly interpolates in between - and further, if you saw that "perfect equation", it would be really hard to work out in your head what it might actually mean for a particular player.
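to put rough numbers on the earlier point about going from 50 to 90 per with bad vs great spd/bh, here is a small sketch using the same example formulas. the two players are hypothetical, and the example coefficients were only picked for illustration, so the size of the gap means little. the direction is what matters: the simple formula pays both players the same for the per bump, and the one with cross terms does not.

```python
def simple_model(ath, spd, per, bh):
    # example coefficients from the post
    return 0.3 * ath + 0.9 * spd + 1.1 * per + 0.7 * bh

def complex_model(ath, spd, per, bh):
    linear = 0.3 * ath + 0.7 * spd + 0.7 * per + 0.5 * bh
    cross = (0.4 / 50) * spd * per + (0.4 / 50) * bh * per
    return linear + cross

def per_gain(formula, ath, spd, bh, lo=50, hi=90):
    """Points gained by raising per from lo to hi with the other ratings fixed."""
    return formula(ath, spd, hi, bh) - formula(ath, spd, lo, bh)

# hypothetical players, made up for this comparison
weak = dict(ath=50, spd=30, bh=30)     # poor spd/bh
strong = dict(ath=50, spd=90, bh=90)   # great spd/bh

for label, player in (("weak spd/bh", weak), ("strong spd/bh", strong)):
    s = per_gain(simple_model, **player)
    c = per_gain(complex_model, **player)
    print(f"{label:14s} simple: {s:5.1f}   complex: {c:5.1f}")
```

with real tuned coefficients (or the squared terms added) the weak player's gain could be pushed much lower. this only shows that the cross terms are what make the value of per depend on spd and bh at all.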