Three-way battles Topic

Posted by oldwarrior on 3/5/2022 11:24:00 AM (view original):
I agree that the lowest team in the 3-way battle seems to win with a lot more frequency than the odds shown.

I've tracked 2-way battles since HD3 was released. I think that's over 4 calendar years?

A small sample size of just under 140, but even at 140 the margin of error should be greatly reduced.

In the 138 battles in which I have been a favorite, with average odds of 63%, I've won 36%.
For the good news: when I'm at 70%+, I have won a little over 50%.

Until I begin seeing differently, when I see odds between 25% and 75%, I'm assuming it really was a 50/50 coin flip.

is it just me or does this just sound incredibly egregious?
"In the 138 battles in which I have been a favorite, with average odds of 63%, I've won 36%."

any chance you'd be willing to share the data set, or the p-value on those 138 or whatever it is?
3/5/2022 10:18 PM
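For what it's worth, the p-value being asked for here can be computed directly from the figures oldwarrior reported (138 battles, average listed odds of 63%, a 36% observed win rate). A minimal sketch, under the simplifying assumption that every battle was an independent flip at a flat 63%:

```python
from math import comb

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n = 138              # battles tracked as the favorite
p = 0.63             # average displayed win odds
k = round(0.36 * n)  # roughly 50 wins actually observed

p_value = binom_cdf(k, n, p)
print(f"P(winning <= {k} of {n} battles at {p:.0%} each): {p_value:.1e}")
```

If the displayed odds were honest and the battles independent, a record this bad would essentially never happen by luck alone; the real question, raised later in the thread, is whether the tracked set actually includes every battle.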
When I first started tracking wins/losses as the favorite, I'm sure the coaches in the SEC/Allen were tired of me complaining every season. Like the season I lost 4 battles with odds of 64-74% on all of them. I think there was a stretch where I lost 10 or 11 straight battles I was favored in. If I remember right, the odds based on the percentages were something like 6500 to 1 on that happening. At LSU I was winning around 28% of the first 40 or so battles.

The majority of these were top 100 overall guys. I've had a LOT of seasons with 8 man rotations.

3/6/2022 12:36 AM
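The streak described above is easy to sanity-check. The remembered "6500 to 1" depends on the exact odds in each battle, which aren't given; as a rough sketch assuming a flat 65% favorite (the cited battles ran 64-74%) and independent battles:

```python
# probability of k consecutive losses when favored at p_win each time,
# assuming independent battles at a flat favorite percentage
def losing_streak_odds(p_win, k):
    return (1 - p_win) ** k

for k in (4, 10, 11):
    p = losing_streak_odds(0.65, k)
    print(f"{k} straight losses as a 65% favorite: about 1 in {round(1 / p):,}")
```

Ten straight losses at 65% works out to roughly a 1-in-36,000 event for any one stretch, so the true odds were likely even longer than the remembered 6500 to 1, though with many coaches playing many seasons, someone somewhere will eventually hit such a streak.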
Posted by gillispie on 3/5/2022 10:06:00 PM (view original):
Posted by shoe3 on 3/5/2022 3:47:00 PM (view original):
Posted by gillispie on 3/5/2022 2:19:00 PM (view original):
Posted by oldwarrior on 3/5/2022 11:24:00 AM (view original):
I agree that the lowest team in the 3-way battle seems to win with a lot more frequency than the odds shown.

I've tracked 2-way battles since HD3 was released. I think that's over 4 calendar years?

A small sample size of just under 140, but even at 140 the margin of error should be greatly reduced.

In the 138 battles in which I have been a favorite, with average odds of 63%, I've won 36%.
For the good news: when I'm at 70%+, I have won a little over 50%.

Until I begin seeing differently, when I see odds between 25% and 75%, I'm assuming it really was a 50/50 coin flip.

wow... i agree that anything 100+ is a lot. i was gonna reply to shoe but didn't i guess, that i didn't buy his claim on the low sample size. it's irrelevant that it's a small sample compared to all battles, completely irrelevant. i was gonna say, even a single coach can get to 100+ and that is a plenty big sample size, and CERTAINLY a consortium of coaches could get up there.

anyway. someone like cub or someone else who knows statistics, should be able to put a % on it - the odds that your data is explainable by luck alone, some measure along those lines? by 140 i am thinking those numbers should be pretty damn meaningful, i don't see how luck is going to explain 63% vs 36%. i am sort of thinking all the lines of inquiry would mostly question the data set itself, the integrity of it, are all the battles really included etc. - but knowing you are the source, this sounds pretty damn problematic to me.
One coach’s tracking is a minute sample size. It doesn’t matter how many seasons it goes. The most you could draw from that is that the system was *possibly* biased somehow against that coach’s recruiting efforts (however the mechanics would look) over that period of time. Is that problematic in its own way? Sure, and I’m not saying that’s not possible, but that’s not the question at hand, which is a system wide brokenness, the idea that the odds are “inverted”.

In the context of 300+ coaches in a single world, with likely 1000+ battles in a season/world, a single coach’s 3-4 battles are meaningless. Taking that out to 150 battles over the span of 45 seasons doesn’t really matter in terms of showing odds inversion, because the system now has ~45,000+ battles to look at. Less than half of 1% of overall occurrence is not enough power to show clinical significance.

Now again, if we want to discuss whether it’s possible that the system can be made to be biased against individual users, that’s a different discussion.
'Less than half of 1% of overall occurrence is not enough power to show clinical significance.'

respectfully, this is just not how statistics works. as long as the sample is a random sample it should serve. a coach diligently tracking their entire set of battles with odds would qualify.

math is not your thing, this one is unambiguous.
That is literally how statistics works in clinical trials, which was my job for a decade. You need a large enough sample, with clear methodology and reproducible results, in order to have the power to show clinical significance. A sample of .003 of the entire pool, taken from a single subject, does not approach that power. It would be laughed out of any investigator meeting, especially post-Wakefield 1998.

Gil, you are trying to make this small set of figures mean something it should not represent. The fact is, .003 does not represent enough of the battles to tell us anything meaningful about *systemic issues* other than the kind of luck a particular user has been having over that period of time. If it is prolonged, *at best* it could indicate a bias against that user, which as I said is indeed problematic. But that is a different discussion.

And of course, that’s all assuming we could completely trust methodology, which… look I have no reason to doubt or trust anyone here, but the whole point is that WIS is the only entity with *actual* reliable data. I wouldn’t even trust my own data, not just because of my attention deficit, but also because it is very easy to miss battles the way the system is set up, battles that may emerge late and you didn’t even know they appeared (those would likely be battles won as the leader, with someone sneaking in late, which does happen fairly often).

Of course everyone remembers bad beats; survivorship bias is real in these discussions. But if you win when you were “supposed to”, hell you may never even know a battle happened.
3/6/2022 1:35 AM (edited)
Posted by shoe3 on 3/6/2022 1:35:00 AM (view original):
Posted by gillispie on 3/5/2022 10:06:00 PM (view original):
Posted by shoe3 on 3/5/2022 3:47:00 PM (view original):
Posted by gillispie on 3/5/2022 2:19:00 PM (view original):
Posted by oldwarrior on 3/5/2022 11:24:00 AM (view original):
I agree that the lowest team in the 3-way battle seems to win with a lot more frequency than the odds shown.

I've tracked 2-way battles since HD3 was released. I think that's over 4 calendar years?

A small sample size of just under 140, but even at 140 the margin of error should be greatly reduced.

In the 138 battles in which I have been a favorite, with average odds of 63%, I've won 36%.
For the good news: when I'm at 70%+, I have won a little over 50%.

Until I begin seeing differently, when I see odds between 25% and 75%, I'm assuming it really was a 50/50 coin flip.

wow... i agree that anything 100+ is a lot. i was gonna reply to shoe but didn't i guess, that i didn't buy his claim on the low sample size. it's irrelevant that it's a small sample compared to all battles, completely irrelevant. i was gonna say, even a single coach can get to 100+ and that is a plenty big sample size, and CERTAINLY a consortium of coaches could get up there.

anyway. someone like cub or someone else who knows statistics, should be able to put a % on it - the odds that your data is explainable by luck alone, some measure along those lines? by 140 i am thinking those numbers should be pretty damn meaningful, i don't see how luck is going to explain 63% vs 36%. i am sort of thinking all the lines of inquiry would mostly question the data set itself, the integrity of it, are all the battles really included etc. - but knowing you are the source, this sounds pretty damn problematic to me.
One coach’s tracking is a minute sample size. It doesn’t matter how many seasons it goes. The most you could draw from that is that the system was *possibly* biased somehow against that coach’s recruiting efforts (however the mechanics would look) over that period of time. Is that problematic in its own way? Sure, and I’m not saying that’s not possible, but that’s not the question at hand, which is a system wide brokenness, the idea that the odds are “inverted”.

In the context of 300+ coaches in a single world, with likely 1000+ battles in a season/world, a single coach’s 3-4 battles are meaningless. Taking that out to 150 battles over the span of 45 seasons doesn’t really matter in terms of showing odds inversion, because the system now has ~45,000+ battles to look at. Less than half of 1% of overall occurrence is not enough power to show clinical significance.

Now again, if we want to discuss whether it’s possible that the system can be made to be biased against individual users, that’s a different discussion.
'Less than half of 1% of overall occurrence is not enough power to show clinical significance.'

respectfully, this is just not how statistics works. as long as the sample is a random sample it should serve. a coach diligently tracking their entire set of battles with odds would qualify.

math is not your thing, this one is unambiguous.
That is literally how statistics works in clinical trials, which was my job for a decade. You need a large enough sample, with clear methodology and reproducible results, in order to have the power to show clinical significance. A sample of .003 of the entire pool, taken from a single subject, does not approach that power. It would be laughed out of any investigator meeting, especially post-Wakefield 1998.

Gil, you are trying to make this small set of figures mean something it should not represent. The fact is, .003 does not represent enough of the battles to tell us anything meaningful about *systemic issues* other than the kind of luck a particular user has been having over that period of time. If it is prolonged, *at best* it could indicate a bias against that user, which as I said is indeed problematic. But that is a different discussion.

And of course, that’s all assuming we could completely trust methodology, which… look I have no reason to doubt or trust anyone here, but the whole point is that WIS is the only entity with *actual* reliable data. I wouldn’t even trust my own data, not just because of my attention deficit, but also because it is very easy to miss battles the way the system is set up, battles that may emerge late and you didn’t even know they appeared (those would likely be battles won as the leader, with someone sneaking in late, which does happen fairly often).

Of course everyone remembers bad beats; survivorship bias is real in these discussions. But if you win when you were “supposed to”, hell you may never even know a battle happened.
if you understood what was happening in those clinical trials, you would understand the relationship between the sample size and the overall size *shrug*. like you said - you need a large enough sample. it doesn't matter how big the sample is relative to the overall size. what you are saying effectively is 'to measure this fair coin, this 50-50 coin, if its going to be flipped a thousand times, i might need to flip it X times to have some faith in the coin - but if we were going to flip it a million times, i'd have to flip it much longer to have the same confidence!'. it just is logically incoherent, not really sure what else to tell you.

not really interested in debating statistics 101 with a guy who admittedly has a pretty dark view on mathematics, just letting you know what the truth of the matter is, if you'd like to ignore it, feel free!
3/6/2022 7:37 AM
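The coin-flip point above can be made concrete: the margin of error on a sample proportion depends on the sample size n, not on how big the overall pool is. A sketch using the standard error of a proportion, taking the ~45,000-battle pool size from shoe's earlier estimate:

```python
from math import sqrt

def proportion_se(p, n, N=None):
    """Standard error of a sample proportion, with an optional
    finite-population correction for a pool of size N."""
    se = sqrt(p * (1 - p) / n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))  # the FPC only ever shrinks the error
    return se

n = 138
se_infinite = proportion_se(0.63, n)            # pool size ignored
se_corrected = proportion_se(0.63, n, N=45000)  # pool size accounted for
print(f"+/- {se_infinite:.4f} vs +/- {se_corrected:.4f}")
```

Accounting for the pool shaves the error from roughly ±4.11 percentage points to ±4.10: the size of the pool is essentially irrelevant, which is why 138 battles is a usable sample, provided it was collected without selection bias. Shoe's separate point about missed battles is the real vulnerability.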
Posted by gillispie on 3/6/2022 7:37:00 AM (view original):
Posted by shoe3 on 3/6/2022 1:35:00 AM (view original):
Posted by gillispie on 3/5/2022 10:06:00 PM (view original):
Posted by shoe3 on 3/5/2022 3:47:00 PM (view original):
Posted by gillispie on 3/5/2022 2:19:00 PM (view original):
Posted by oldwarrior on 3/5/2022 11:24:00 AM (view original):
I agree that the lowest team in the 3-way battle seems to win with a lot more frequency than the odds shown.

I've tracked 2-way battles since HD3 was released. I think that's over 4 calendar years?

A small sample size of just under 140, but even at 140 the margin of error should be greatly reduced.

In the 138 battles in which I have been a favorite, with average odds of 63%, I've won 36%.
For the good news: when I'm at 70%+, I have won a little over 50%.

Until I begin seeing differently, when I see odds between 25% and 75%, I'm assuming it really was a 50/50 coin flip.

wow... i agree that anything 100+ is a lot. i was gonna reply to shoe but didn't i guess, that i didn't buy his claim on the low sample size. it's irrelevant that it's a small sample compared to all battles, completely irrelevant. i was gonna say, even a single coach can get to 100+ and that is a plenty big sample size, and CERTAINLY a consortium of coaches could get up there.

anyway. someone like cub or someone else who knows statistics, should be able to put a % on it - the odds that your data is explainable by luck alone, some measure along those lines? by 140 i am thinking those numbers should be pretty damn meaningful, i don't see how luck is going to explain 63% vs 36%. i am sort of thinking all the lines of inquiry would mostly question the data set itself, the integrity of it, are all the battles really included etc. - but knowing you are the source, this sounds pretty damn problematic to me.
One coach’s tracking is a minute sample size. It doesn’t matter how many seasons it goes. The most you could draw from that is that the system was *possibly* biased somehow against that coach’s recruiting efforts (however the mechanics would look) over that period of time. Is that problematic in its own way? Sure, and I’m not saying that’s not possible, but that’s not the question at hand, which is a system wide brokenness, the idea that the odds are “inverted”.

In the context of 300+ coaches in a single world, with likely 1000+ battles in a season/world, a single coach’s 3-4 battles are meaningless. Taking that out to 150 battles over the span of 45 seasons doesn’t really matter in terms of showing odds inversion, because the system now has ~45,000+ battles to look at. Less than half of 1% of overall occurrence is not enough power to show clinical significance.

Now again, if we want to discuss whether it’s possible that the system can be made to be biased against individual users, that’s a different discussion.
'Less than half of 1% of overall occurrence is not enough power to show clinical significance.'

respectfully, this is just not how statistics works. as long as the sample is a random sample it should serve. a coach diligently tracking their entire set of battles with odds would qualify.

math is not your thing, this one is unambiguous.
That is literally how statistics works in clinical trials, which was my job for a decade. You need a large enough sample, with clear methodology and reproducible results, in order to have the power to show clinical significance. A sample of .003 of the entire pool, taken from a single subject, does not approach that power. It would be laughed out of any investigator meeting, especially post-Wakefield 1998.

Gil, you are trying to make this small set of figures mean something it should not represent. The fact is, .003 does not represent enough of the battles to tell us anything meaningful about *systemic issues* other than the kind of luck a particular user has been having over that period of time. If it is prolonged, *at best* it could indicate a bias against that user, which as I said is indeed problematic. But that is a different discussion.

And of course, that’s all assuming we could completely trust methodology, which… look I have no reason to doubt or trust anyone here, but the whole point is that WIS is the only entity with *actual* reliable data. I wouldn’t even trust my own data, not just because of my attention deficit, but also because it is very easy to miss battles the way the system is set up, battles that may emerge late and you didn’t even know they appeared (those would likely be battles won as the leader, with someone sneaking in late, which does happen fairly often).

Of course everyone remembers bad beats; survivorship bias is real in these discussions. But if you win when you were “supposed to”, hell you may never even know a battle happened.
if you understood what was happening in those clinical trials, you would understand the relationship between the sample size and the overall size *shrug*. like you said - you need a large enough sample. it doesn't matter how big the sample is relative to the overall size. what you are saying effectively is 'to measure this fair coin, this 50-50 coin, if its going to be flipped a thousand times, i might need to flip it X times to have some faith in the coin - but if we were going to flip it a million times, i'd have to flip it much longer to have the same confidence!'. it just is logically incoherent, not really sure what else to tell you.

not really interested in debating statistics 101 with a guy who admittedly has a pretty dark view on mathematics, just letting you know what the truth of the matter is, if you'd like to ignore it, feel free!
Gil, once again, you are taking one small piece of what I said, pulling it out of context, getting stuck on it, pretending like this is the whole thing, and ignoring all of everything else. Don’t be absurd.

(Aside, I don’t really know what “dim view on mathematics” means; like when you said “math is not your thing” I thought maybe I told you once math didn’t excite me or something like that, and maybe that’s true, but turning it into “see shoe doesn’t math, so don’t listen to him when he’s skeptical of the idea that one coach’s sample represents the entire sample of battles over that period” is pretty disingenuous. I don’t look at this part like you do, for sure.)

Anyway, your 100 thing really is stats 101. Or rather, the concept that “100 random samples is enough” is for *very simple tests*. Like if you want to see if folks can tell the difference between two brands of light beer, that kind of thing. But the OP hypothesis is based on the idea that one person’s experience would have to be reproducible. What you are doing is essentially assuming that it is. You’re ignoring the conflicting evidence that cub offered in this thread already, suggesting that the results aren’t entirely reproducible. That’s extremely poor science. One person can’t give you good data because the set they have access to is always too small relative to the whole. And again, one person has limited validity, survivorship bias of bad beats is a thing, etc. Your lack of skepticism is really confounding here.

Now if you got 100 coaches to report trustworthy data for a recruiting season, then you would have something worth considering.

3/6/2022 8:51 AM
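One way to weigh the survivorship-bias argument against the reported numbers is a quick simulation: if the displayed odds were honest, how bad would the *unluckiest* coach's record look? A sketch with hypothetical numbers (each simulated coach tracking 138 battles at a flat 63%, far more coaches than any one world holds):

```python
import random

random.seed(42)  # fixed seed for reproducibility

def tracked_record(n_battles=138, p_win=0.63):
    """Wins for one coach over n_battles honest 63% battles."""
    return sum(random.random() < p_win for _ in range(n_battles))

# simulate 10,000 coaches, each tracking their own 138 battles
records = [tracked_record() for _ in range(10_000)]
worst_rate = min(records) / 138
print(f"unluckiest of 10,000 honest coaches won {worst_rate:.0%}")
```

Even the worst of ten thousand honest coaches typically lands in the high-40s percent, nowhere near 36%. So survivorship bias alone struggles to explain the reported record, which pushes the remaining explanations toward either a data-collection gap (shoe's concern about unnoticed battles) or a genuine problem with the displayed odds.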