crimsyn
Junior Strategist
Posts: 389
|
Post by crimsyn on Oct 4, 2017 5:23:14 GMT
Just running a quick analysis on the data:
There were 1010 games played, which means there were 2020 opportunities for a win.
Cryx was played 306 times, won 208 times, and lost 98 times.
Non-Cryx was played 1614 times, won 752 times, and lost 862 times.
The expected outcome is for both Cryx and non-Cryx to win 50% of the time, that is:
Cryx with 153 wins and 153 losses
Non-Cryx with 807 wins and 807 losses
Running a quick chi-squared test, this is pretty definitively statistically significant -- I'm getting a P-value of 0.000000000007. So, we can almost certainly say that running Cryx in a WTC-style team tournament format, prior to the massive theme drop, will increase your chance of scoring a win in your game.
That said, 1. There are differences between team tournaments and solo events, and, 2. Any analysis of WTC data has probably already been rendered useless by the theme drop, which makes Warmachine a whole new game.
My gut tells me that we're getting close to the same area as Mad Dogs, Una2, etc, and those all got nerfed, but to be sure, I'd like to see similar results for the next big, post-theme solo event. I feel like that, more than anything, would tell us where we're at.
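The figures above can be re-checked with a short script. This is only a sketch: the post does not say how the test was set up, so it assumes a comparison of the four observed counts against the four expected counts with one degree of freedom, and the names `observed`, `expected`, `chi_squared_statistic`, and `chi_squared_p_value` are made up for illustration. It uses the counts exactly as quoted, which total 1920 rather than 2020.

```python
import math

# Counts as quoted in the post: Cryx wins, Cryx losses, non-Cryx wins, non-Cryx losses.
observed = [208, 98, 752, 862]
# Expected counts if every game were a coin flip.
expected = [153, 153, 807, 807]


def chi_squared_statistic(obs, exp):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


def chi_squared_p_value(stat):
    """Upper-tail p-value for a chi-squared variable with 1 degree of freedom.

    With df = 1 the survival function reduces to erfc(sqrt(stat / 2)).
    Assumption: df = 1, chosen because it reproduces the quoted p-value.
    """
    return math.erfc(math.sqrt(stat / 2))


stat = chi_squared_statistic(observed, expected)
p_value = chi_squared_p_value(stat)
print(f"chi-squared = {stat:.2f}, p = {p_value:.1e}")
```

Under those assumptions the statistic comes out near 47 and the p-value on the order of 7e-12, in line with the figure quoted above. A small p-value here only says the counts are unlikely under a 50/50 split; it says nothing about why.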
|
|
zich
Junior Strategist
Posts: 690
|
Post by zich on Oct 4, 2017 7:28:39 GMT
I see. Maybe I'm not clever enough for this, but please bear with me: Why do you attribute player wins solely to player skill when faction choice having an influence on this can certainly not be ruled out? Or is that not what you are doing? Or in other words: How do you decouple the factors of player strength and faction strength in their contribution towards a player's wins?
Welcome to the quandaries of social science and the joys of post hoc analysis.
Firstly, here's a list of some of the things we don't know about:
Which teams chose which matchups?
How many games against Cryx each team and each player played in preparation for the WTC.
How each team viewed Cryx and how they worked to mitigate the faction.
How many teams who took Cryx did so because that player exclusively chose Cryx for the "win more" (since the WTC as a premier competition allows for a more explicit "win more" attitude than might be tolerated at an LGS without such aspirations).
Conversely, why teams that didn't take Cryx chose not to do so: e.g. cost to acquire models, effort to learn a new faction, not wanting to be a "win more" player, etc.
Any of these factors confound performance, whether at a faction, team, or individual level.
Secondly, "skill" in Warmachine is a really nebulous topic due to a number of factors: factors that we've either refused to adopt as a global community (Elo tracking for players) or are unable to track due to a lack of proper collection mechanisms, such as what actually happens in a particular game: dice spikes and their timing, the impact of scenarios on positioning, the impact of terrain, etc.
With these two things in mind, if we want to investigate relationships and not just speculate about them, we have to do more than look at descriptive statistics. So I chose as a proxy for player "skill" the number of wins a player earned, to determine if that was a factor in a team's overall performance.
While there is collinearity between faction and individual player wins, there wasn't significant collinearity. Granted, I was using basic linear regression for this analysis and plan on using partial least squares regression, as well as looking at F change as variables get added to the model, to determine the impact of particular variables.
I see. In part at least. One last question.
Us non-mathematically-minded have arrived at the very simple result that (discarding mirror matches) Cryx players win 71% of their games. I initially merely (mis)understood you as saying "yes, but here's why that's wrong". Now let me try a second shot at understanding your premise and results. What you are examining is not the power of the faction in a vacuum, but its value to a team in the matchup process. And you conclude this to be less impactful than the apparent power of Cryx in single games. Is that right?
In fact, allow me another question. You say there is very little correlation between a player's faction and their wins. How can this be when some factions only win ~30% of their games and others get as high as ~70%?
|
|
|
WTC Stats
Oct 4, 2017 11:57:57 GMT
via mobile
Post by professorlust on Oct 4, 2017 11:57:57 GMT
Chi-squared doesn't do what you think it does. It tells us there's a difference; it doesn't tell us the correlates of that difference or explain it. Type 1 errors are bad.
What we need to ask is not "Did Cryx do well at the WTC?" but "Did Cryx do so well at the WTC as to indicate the faction is overpowered?"
If you look at my last update, the answer is that team composition matters more than playing Cryx.
|
|
|
Post by professorlust on Oct 4, 2017 11:58:26 GMT
Look at my "things we don't know" discussion, especially as it relates to preparation. How many games did each WTC player have against post-CID Cryx? We don't know. Did anyone change their planned WTC lists after playing post-CID Cryx? Etc.
Is there perfect balance in Warmachine? No, but that's a foolhardy quest. There are a number of factors that influence the outcomes of games, especially in a team environment.
Now, I'm not arguing that Cryx, and Denny1 in particular, shouldn't be rebalanced. I'm on record at the FiretruckFort as supporting said nerfs. I am saying: stop making Type 1 errors because you're scared by the outcome of a non-standard competitive environment.
|
|
|
Post by jisidro on Oct 4, 2017 12:10:32 GMT
He got an infinitesimal chance that Cryx's win/loss record at the WTC is the result of a 50% chance of winning any match-up... It may not answer a lot of questions, but if you run the test for the other factions you get an idea of who performed above or below expectations, and you can compare p-values to try to gain some perspective.
|
|
|
Post by professorlust on Oct 4, 2017 12:21:58 GMT
No, he ran a test not of winning matchups but of total wins expected overall. Substantially different questions.
If you actually break it down via dummy coding (I do need to look at it via effect coding), the effect of playing Cryx on winning any particular game is non-significant. Indeed, even when we look at total games a team won, while in isolation Cryx looks strong, having both on your team and the "skill" of the players on the team is far more important than simply playing Cryx at the WTC. So much so that when we hold player wins and team composition constant, playing as Cryx had a negative effect on your team's total matches won (larger if dummy coded, slightly smaller if effect coded; both negative, both significant).
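For readers who haven't met the terms: dummy coding represents a two-level variable such as "played Cryx" as 0/1, while effect coding uses -1/+1. The sketch below uses invented numbers, not WTC data, and a plain one-variable least-squares fit rather than the model described in this post; `is_cryx`, `wins`, and `ols_slope` are hypothetical names. It only illustrates that, for a two-level variable, the choice of coding rescales the coefficient without changing its sign.

```python
def ols_slope(x, y):
    """Slope of a one-variable least-squares fit of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    return covariance / variance


# Invented example data (NOT taken from the WTC results).
is_cryx = [1, 1, 1, 0, 0, 0, 0, 0]  # dummy coding: 1 = Cryx, 0 = other
wins = [5, 4, 5, 3, 2, 4, 3, 3]     # games won by each hypothetical player

# Effect coding maps 1 -> +1 and 0 -> -1.
effect_coded = [2 * flag - 1 for flag in is_cryx]

dummy_slope = ols_slope(is_cryx, wins)
effect_slope = ols_slope(effect_coded, wins)
print(f"dummy-coded slope:  {dummy_slope:.3f}")
print(f"effect-coded slope: {effect_slope:.3f}")
```

With only two levels, the dummy-coded slope is the difference between the two group means and the effect-coded slope is half of that. With more than two factions the two schemes answer different questions (difference from a reference faction versus difference from the average across factions), so the numbers need not line up so neatly.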
|
|
Post by crimsyn on Oct 4, 2017 12:46:18 GMT
All I said was that playing Cryx in this format, you're more likely to score a win. That said, I'm struggling to find an explanation for this difference other than a couple of Cryx lists being above the power curve or otherwise hard to counter.
The format itself more or less controls for skill already, being one that only the top players in their countries attend, so I don't think it is likely that there are huge variations in skill. Also, it seems unlikely that it just happened to be that all the best players play Cryx and all the not-so-great (relatively speaking) players play other factions, or that all the Cryx players had really hot dice all weekend long.
I've never played in a 5-man team event like the WTC, so I'm not familiar with all the complications from pairing, etc., so that could be another explanation. But even in that case it also tells us something. Everyone knew Cryx would be popular at the WTC and would be a faction to tech against, and in spite of that, they still had this amazing w/l record. If, in spite of having plenty of time to prepare and knowing about GF and DH for months, a bunch of team captains, who are some of the best players in the world, decided that their best path to victory was to throw someone under the bus and try to go 3-1 in their other games, there are some pretty strong implications there as well.
...that is, if the game didn't change massively a week later.
|
|
|
Post by professorlust on Oct 4, 2017 13:09:45 GMT
Still not what chi-squared examines. It doesn't test directionality; it tests difference.
I've run the chi-squared, and yes, there is a significant difference in the number of wins, both across the field as a whole and when we break it down to just Cryx versus the rest of the field. This doesn't answer any directional, correlative, or causal questions.
Also, how many people actually had experience against Ghost Fleet? Sure, they read about it on the internet, but how many games did they play against it, and how, if at all, did they adjust their lists?
|
|
Post by crimsyn on Oct 4, 2017 13:16:13 GMT
I suppose it could be possible that the Cryx win rate is mostly due to the pairing process, but even that tells us something.
From my experience, I get the feeling that when playing against a player of more or less equal skill, the Cryx matchup demands some very specific tech to even make a game of it. Whether this is overpowered or not, having such a bogeyman out there in the meta that it forces your hand on list design to this extent (do you have magic weapons AND RFP?) feels like a pretty big issue with regard to game design.
Honestly, my hope when it comes to Mk.IV is that they simplify the death mechanics a bit... having three separate states of dying after you run out of hp, and having to have two different dead piles to account for models that have been RFP'ed rather than killed, seems... inelegant.
|
|
|
Post by fdf86 on Oct 4, 2017 13:21:02 GMT
I'm curious: what are the final records of ppl who played vs Denny and Coven? Like, did a lot of ppl finish 0-6?
|
|
|
Post by jisidro on Oct 4, 2017 13:21:22 GMT
"Still not what chi-squared examines. It doesn't test directionality; it tests difference. I've run the chi-squared, and yes, there is a significant difference in the number of wins between Cryx and the rest of the field. This doesn't answer any directional, correlative, or causal questions."
It answers one basic question: did Cryx win way more than expected (not really, because it went in as the bogeyman...) if every game had a 50/50 chance of going either way? Apparently so. Nothing about the reasons.
"Also, how many people actually had experience against Ghost Fleet? Sure, they read about it on the internet, but how many games did they play against it, and how, if at all, did they adjust their lists?"
This sounds like a Cryxian argument... Who knows? It wasn't a surprise at the WTC; its performance was well known... All things being equal, why go with "people were unprepared" instead of "people were prepared"?
"320 players, 160 games each round. 80 wins each round, cryx"
What do you mean here?
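The "did Cryx win more than expected under 50/50" question can also be put as a one-sided exact binomial test, which does have a direction. A rough sketch, using the 208 wins in 306 games quoted earlier in the thread and assuming (unrealistically) that games are independent coin flips; `binomial_upper_tail` is a made-up helper name.

```python
import math


def binomial_upper_tail(successes, trials):
    """P(X >= successes) for X ~ Binomial(trials, 0.5), computed exactly."""
    favourable = sum(math.comb(trials, k) for k in range(successes, trials + 1))
    return favourable / 2 ** trials


cryx_wins, cryx_games = 208, 306  # figures quoted earlier in the thread
win_rate = cryx_wins / cryx_games
p_one_sided = binomial_upper_tail(cryx_wins, cryx_games)
print(f"win rate = {win_rate:.1%}, one-sided p = {p_one_sided:.1e}")
```

As with the chi-squared result, a tiny p-value says the record is hard to square with coin-flip games; it does not separate faction strength from pairings, preparation, or player skill.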
|
|
|
Post by Azuresun on Oct 4, 2017 14:05:21 GMT
"Also, how many people actually had experience against Ghost Fleet? Sure, they read about it on the internet, but how many games did they play against it, and how, if at all, did they adjust their lists?"
Assuming that players are just losing to Ghost Fleet for lack of training against it (and none of them thought, hey, maybe I should get some games in against the list that's been the bogeyman for months), that's still saying that Cryx have an advantage. After all, there should have been at least some lists in other factions that caught people by surprise, so why didn't those lists produce an equally dramatic skew in results for their factions? It's hard to see any conclusion other than "Cryx do it better".
|
|
|
Post by smoothcriminal on Oct 4, 2017 14:15:24 GMT
Seems like a load of bullshit to me. We're in trolling territory here.
|
|
|
Post by professorlust on Oct 4, 2017 14:36:07 GMT
1) No game will be 50:50 in Warmachine, in part due to design effects but in large part due to factors beyond the designers' control. Any attempt to cajole PP into designing so that every major competitive format has no significant difference in faction performance is silly.
2) I'm not a Cryx player, if that's what you're assuming. I'm playing Circle right now and played Menoth for most of 2017. The last time I played Cryx was November 2016, when I won a 32-player event with Coven and Skarre1. The reason factors like experience with a matchup matter is that knowing a thing is very different from doing a thing. As I noted before, the point is that there are a lot of suppositions about what's going on in the meta, but almost all of them are hyperbolic assumptions based on descriptives or, until recently, weak inferential tests like chi-squared. If people want to chicken little and make Type 1 errors, by all means go for it.
3) An incomplete thought that I didn't have time to articulate, so I deleted it.
|
|
Post by zich on Oct 4, 2017 14:46:26 GMT
"If people want to chicken little and make Type 1 errors, by all means go for it."
So in plain English, what you are saying is: "Cryx players winning a lot could have other causes than Cryx being a strong faction." Is that correct? Because I can see no other way for the conclusion "Cryx wins a lot -> Cryx is strong" to be invalid.
|
|