|
WTC Stats
Oct 4, 2017 14:53:56 GMT
via mobile
Post by professorlust on Oct 4, 2017 14:53:56 GMT
Seems like a load of bullshit to me. We're in trolling territory here. Que?
|
|
|
Post by Gamingdevil on Oct 4, 2017 15:10:39 GMT
If people want to chicken little and make Type 1 errors, by all means go for it.

So in plain English, what you are saying is: "Cryx players winning a lot could have other causes than Cryx being a strong faction." Is that correct? Because I can see no other way for the conclusion of "Cryx wins a lot -> Cryx is strong" to be invalid.

Cryx is definitely strong, he's just trying to nuance the data to prevent people from jumping to conclusions like "Denny 1, Skarre 1, Coven, Wraith Engine, Revenant Crew and Dark Host all need to get nerfed!"

Also, to the people assuming that all players at the WTC are of the same (extremely high) skill level: with all due respect, that's not true. Some countries don't have a very large meta and thus get very little training; some players might only play a few times a month, but are still "the best" of their country, because there is no one else. Some countries had 2 or even 3 teams, and more often than not, those teams are not all of the same level. If you took all the best players in the world and then divided them into roughly equal teams, then this hypothesis would fly, but some countries just don't have the talent or pool of people to stand up to the best of other countries, and that's fine, because like at any tournament, someone needs to lose for the winners to win.

Personally, for instance, I think of myself as a middle of the pack player. I know and have played against most of the people on the winning German team. With any given list, I think I have maybe a 20% chance to win against them. Those guys are amazing, play a shit ton of games and know the game inside and out. I take the game seriously, but only get 1-2 games in per week. Most people, and thus most countries, are like that. When we were doing the matchups, there were countries that were considered weak, some that were considered average to good and some that were considered strong.

Not all teams are formed of the best players in the world; many are just regular people who like tournaments and want to join the most awesome Warmachine experience in the world.
|
|
|
Post by 36cygnar24guy36 on Oct 4, 2017 15:15:08 GMT
Cryx is well above the power curve, that is just a fact. However, as this data is from a team tournament, some people will just not accept it as valid.
We will have to wait and see how Cryx do at big tourneys and conventions over the next 3-6 months before it will be widely accepted.
|
Post by crimsyn on Oct 4, 2017 15:35:13 GMT
Okay, so:
H0: there is no relationship between playing Cryx in this format and individual wins.**
H1: there is a relationship between playing Cryx in this format and individual wins.
Because of the results of the chi-squared, we reject H0.
So, there is a relationship between playing Cryx in this format and individual wins. There are a number of possible explanations for this. What I can think of are:
1. Cryx is, as of the WTC, very powerful or hard to answer.
2. People who play Cryx are just better at Warmachine, so they should win much more often.
3. Cryx still has a surprise factor, and people aren't prepared to deal with it.
4. The pairing process and format is giving Cryx a lot of favourable matchups, because team captains are either getting their Cryx player into a good matchup, or throwing someone under the Cryx bus to try to get good matchups in the other four games.
Explanation 2 just seems unlikely because we've seen this before with Mad Dogs, Una2, etc. Explanation 3 also seems unlikely, given that people have known about the power of GF and DH for months, and they would have had time, before and after lists were made public, to prepare.
4 and 1 are both possibilities. But even if it is an artifact of the pairing process, I think it is still useful information. After all, if the best strategy against Cryx is to dodge, that also has some implications about where they are at in the meta.
My personal opinion is that:
1. A couple of Cryx casters need to be looked at. Not saying the sky is falling and they definitely, totally, must be nerfed, but they should be looked at sometime over the next few months.
2. There might be some implications about CID and power creep, faction lobbying, etc., when DH is crushing everything right out of CID.
3. For Mk.IV, it would be nice to clean up the rules around models dying, because the whole three-stage death process and the interaction between RFP and field promotion and recursion is just plain weird and clunky when you actually get into it, and I feel like there has to be a simpler and more elegant way.
4. The recent theme changes changed the game so much that the WTC data is out of date already anyways, especially given the very binary nature of Ghost Fleet. It feels like unless you have some very specific tools, which aren't all that common in the wild, the matchup between two equally skilled players becomes very difficult for the non-GF player. Those tools are now more common because of new themes and merc availability, so that is going to have a huge impact on the matchups, not to mention that Cryx got some new themes as well.
** of course, this is all out of date because of new theme forces, so it is kind of pointless to talk about. Really, to make any serious analysis, what we need is stats on the next big individual event post-theme.
|
|
|
Post by professorlust on Oct 4, 2017 20:45:59 GMT
You're misoperationalizing the Cryx variable if you're trying to use chi-squared as a predictor for individual performance.
Cryx's total wins at the WTC versus other factions' total wins tell us nothing about individual performance.
|
|
|
Post by octaviusmaximus on Oct 4, 2017 22:41:52 GMT
Hey professorlust, I have the barest of statistical knowledge, so I have to assume that your numbers are correct, but I really don't understand how you are drawing your conclusions. You need to help me understand why you are doing what you are doing.
Also, asking whether players had experience with Ghost Fleet? That seems a little odd, because the answer is "yes" and "a whole lot", and they still lost.
|
|
|
Post by professorlust on Oct 5, 2017 0:56:59 GMT
Pointing out that Type 1 errors are bad, and using more advanced statistical tools than raw percentages to avoid making a Type 2 error.
|
|
|
Post by octaviusmaximus on Oct 5, 2017 1:12:33 GMT
Again... Lacking statistical knowledge. What are those errors? What do you mean?
|
|
|
Post by professorlust on Oct 5, 2017 2:10:33 GMT
bfy.tw/EIlS
|
|
|
Post by octaviusmaximus on Oct 5, 2017 2:16:56 GMT
I've noticed that when you get annoyed you use jargon in an attempt to quell anyone talking back to you. I'm trying to get you to use basic standards of argumentation. You could have done that when I asked. You chose not to, and expended energy undermining your argument instead.
|
|
|
Post by dazzla on Oct 5, 2017 4:44:39 GMT
Just running a quick analysis on the data:
There were 1010 games played, which means there were 2020 opportunities for a win.
Cryx was played 306 times, won 208 times, and lost 98 times.
Non-Cryx was played 1614 times, won 752 times, and lost 862 times.
The expected outcome is for both Cryx and non-Cryx to win 50% of the time, that is:
Cryx with 153 wins and 153 losses
Non-Cryx with 807 wins and 807 losses
Running a quick chi-squared test, this is pretty definitively statistically significant -- I'm getting a P-value of 0.000000000007. So, we can almost certainly say that running Cryx in a WTC-style team tournament format, prior to the massive theme drop, will increase your chance of scoring a win in your game.
That said,
1. There are differences between team tournaments and solo events, and,
2. Any analysis of WTC data has probably already been rendered useless by the theme drop, which makes Warmachine a whole new game.
My gut tells me that we're getting close to the same area as Mad Dogs, Una2, etc, and those all got nerfed, but to be sure, I'd like to see similar results for the next big, post-theme solo event. I feel like that, more than anything, would tell us where we're at.

Chi-squared doesn't do what you think it does. It tells us there's a difference; it doesn't tell us the correlates for that difference or explain the difference. Type 1 errors are bad. What we need to ask is not "did Cryx do well at the WTC?" but "did Cryx do so well at the WTC as to indicate the faction is overpowered?" If you look at my last update, the answer is that team composition matters more than playing Cryx.

IMO the regression analysis that you described did not address the question of whether the faction is overpowered, but rather looked at explanations for team performance. I also disagree with the approach.
Using player wins as a proxy for player skill sounds intuitive, but I believe it raises specification issues when included as a variable in this model, with impacts on the reliability of the measures of other variables. But it is a free world and we are all allowed to have different methodologies and views.
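
For anyone who wants to check the quoted chi-squared figure themselves, here is a minimal sketch in Python. It uses only the win/loss counts as quoted above (208/98 for Cryx, 752/862 for non-Cryx, which together total 1,920 results) and compares them to an even 50/50 split. The function names are made up for illustration, and treating the test as having one degree of freedom is an assumption about how the original number was produced, not something the thread confirms.

```python
import math

# Win/loss counts exactly as quoted in the thread.
observed = {
    "Cryx": (208, 98),
    "Non-Cryx": (752, 862),
}

def chi_squared_vs_even_split(table):
    """Sum of (observed - expected)^2 / expected, where every group is
    expected to win half of its games (the 50% baseline from the post)."""
    chi2 = 0.0
    for wins, losses in table.values():
        expected = (wins + losses) / 2
        chi2 += (wins - expected) ** 2 / expected
        chi2 += (losses - expected) ** 2 / expected
    return chi2

def p_value_one_df(chi2):
    """Upper-tail probability of a chi-squared variable with 1 degree of
    freedom (assumed here): P(Z^2 > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chi2 / 2))

chi2 = chi_squared_vs_even_split(observed)
p = p_value_one_df(chi2)
print(f"chi-squared = {chi2:.2f}, p = {p:.1e}")
```

Under those assumptions the statistic comes out at about 47 and the p-value lands in the same tiny range as the one quoted. As the replies point out, that only says the win split differs from 50/50; it does not say why.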
|
Post by wishing on Oct 5, 2017 6:24:51 GMT
Well, I've learned that there is something called a type 1 error and we should avoid making them. Apparently.
|
|
|
Post by oncomingstorm on Oct 5, 2017 6:44:20 GMT
For the record, and to be more helpful than our statistician friend:
Type 1 errors occur when you incorrectly reject the Null hypothesis (i.e. find an effect/correlation when there is none).
Type 2 errors occur when you incorrectly fail to reject the Null hypothesis (i.e. find that there is no effect/correlation when there in fact is one).
Type 1 errors are generally considered more egregious in most fields of science.
However, I have some issues with Professorlust's methodology.
In particular, the way he's controlled for player skill seems...off. If winrate is the variable you're using to measure power, then using 'perfect winrate' as a proxy for player skill (and removing 'highly skilled players') is naturally going to have a circular effect on your findings.
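
The circularity worry above can be illustrated with a toy calculation. This is only a sketch under made-up assumptions, not a reconstruction of Professorlust's actual model: every player is equally skilled, a hypothetical faction wins each game with probability 0.65, each player plays 6 rounds, and anyone with a perfect record is dropped as "highly skilled". The function name and all numbers are illustrative.

```python
from math import comb

def win_rate_after_dropping_perfect(p, rounds):
    """Average win rate of the players left over once everyone who won
    all `rounds` games has been removed, when each game is won
    independently with probability `p` (binomial model, assumed)."""
    kept_share = 0.0   # fraction of players who are kept
    kept_wins = 0.0    # expected wins contributed by kept players
    for wins in range(rounds):  # stops before wins == rounds (perfect record)
        prob = comb(rounds, wins) * p ** wins * (1 - p) ** (rounds - wins)
        kept_share += prob
        kept_wins += prob * wins
    return kept_wins / (kept_share * rounds)

raw = 0.65  # hypothetical true win rate of the faction
filtered = win_rate_after_dropping_perfect(raw, rounds=6)
print(f"true win rate: {raw:.3f}, after dropping perfect records: {filtered:.3f}")
```

Even though skill plays no part in this toy setup, removing the perfect records pulls the faction's measured win rate down (to roughly 0.62 here), simply because a stronger faction produces more perfect records to remove.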
|
Post by wishing on Oct 5, 2017 7:33:35 GMT
So the "Null hypothesis" is a conclusion that we don't know anything?
|
|
|
Post by jisidro on Oct 5, 2017 7:53:34 GMT
If you want to test if, absurdly, people can fly... you do this:
H0: People cannot fly.
H1: People can fly.
Type 1 error is saying a person can fly when they can't.
Type 2 error is saying a person cannot fly when they can.
This comes from the idea that you put the strongest hypothesis in the H0, and the tests tend to make sure you have conclusive evidence to reject what you considered to be right at the start. Kind of like the presumption of innocence in the legal system.
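
To put rough numbers on the two error types described above, here is a small simulation sketch in Python. Everything in it is an assumption for illustration: a hypothetical faction plays 306 games, H0 is "it wins 50% of the time", and H0 is rejected when a one-sample chi-squared statistic exceeds 3.841 (the usual 5% cutoff for one degree of freedom). The function names are made up.

```python
import random

CRITICAL_1DF_05 = 3.841  # chi-squared cutoff for 1 degree of freedom, alpha = 0.05

def rejects_h0(wins, games):
    """True if the win/loss split is far enough from 50/50 to reject H0."""
    expected = games / 2
    losses = games - wins
    chi2 = (wins - expected) ** 2 / expected + (losses - expected) ** 2 / expected
    return chi2 > CRITICAL_1DF_05

def rejection_rate(true_win_prob, games=306, trials=2000, seed=1):
    """Fraction of simulated tournaments in which H0 gets rejected."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(trials):
        wins = sum(rng.random() < true_win_prob for _ in range(games))
        rejected += rejects_h0(wins, games)
    return rejected / trials

# H0 is actually true (fair 50% faction): every rejection is a Type 1 error.
type1_rate = rejection_rate(0.50)
# H0 is actually false (55% faction): every non-rejection is a Type 2 error.
type2_rate = 1 - rejection_rate(0.55)

print(f"Type 1 error rate: {type1_rate:.3f}")
print(f"Type 2 error rate: {type2_rate:.3f}")
```

The Type 1 rate should sit near the 5% threshold that was chosen up front, while the Type 2 rate depends on how big the real effect is and how many games were played: a small real edge is missed fairly often at this sample size.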
|
|