Honestly, would you say you believe in statistics? I was taught to distrust any statistics that I didn’t make up myself. (*98.3% of all statistics are actually made up.*)

When I was younger, I was forced to have some lessons in the noble subject of mathematical statistics after my epic fail at the oral exam in probability theory, so I can hardly be accused of being in love with the matter which Laplace has called “common sense reduced to calculus”.

So, do you believe in statistics? Does the old sailor believe in the drag of the wind? Break the wind down to the molecules and you’ll see myriads of them releasing their kinetic energy to the wrong side of the sails. Break the great laws of Powerplay Manager down to single games and you’ll see nothing but Random. I bet I’ll deny the cosmic order when the mighty Random hits me in midair at the least appropriate time, and that will surely happen, provided that this is a part of the cosmic PPM order.

The topic is – how do we measure the weight of the wind given the molecules?

Let us start with an assumption. The game results are computed by some sophisticated, possibly probabilistic algorithm. (No malevolent demon or creature making fun of our feeble attempts to understand the natural laws.)

Consequently, given the values of all the visible and hidden variables that can influence the game, including but not limited to the skill attributes and seasonal energy of all the players engaged in the game and the tactical options chosen by the managers, there exists a well-defined probability for each event expressed in terms of the result of the game, e.g., the probability of the event “team A wins” or of the event “the total number of shorthanded goals scored by both teams in the game lies between 1 and 3”. The probability exists but remains unknown to us.

Let us first consider the idealized case that all teams are equal. This is certainly not true now and will be even less true in the future as the difference in rates of team development plays a more and more prominent role. Suppose that we observe $n$ games, where the team with tactics $A$ plays vs a team with tactics $B$. Let us choose an event, e.g., that the team with tactics $A$ wins or that the game ends in overtime. Let us define the random variable $X_i$, which takes the value $1$ if the event happens in the $i$-th game and $0$ if it does not.

How would you estimate the probability of the event from the data $X_1, \dots, X_n$? Of course, you would count the cases when the event happened and divide by the total number of the games, i.e., you’d take the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. Let us ask the obvious question: how good is this estimate?
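The counting estimate can be sketched in a few lines of Python. The true probability `p_true`, the number of games `n_games` and the function names are all hypothetical choices for illustration; none of them come from the PPM data.

```python
import random

def simulate_games(p_true, n_games, seed=0):
    """Return a list of 0/1 outcomes: 1 if the event happened in that game."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_true else 0 for _ in range(n_games)]

def sample_mean(outcomes):
    """Fraction of games in which the event happened."""
    return sum(outcomes) / len(outcomes)

# Hypothetical inputs, chosen only to illustrate the estimate.
outcomes = simulate_games(p_true=0.45, n_games=2000)
estimate = sample_mean(outcomes)
print(f"estimated probability: {estimate:.3f}")
```

With more games the printed estimate should settle closer to the value passed as `p_true`, which is what the rest of the argument quantifies.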

Evidently, the random variable $n\bar{X} = \sum_{i=1}^{n} X_i$ is the sum of $n$ Bernoulli random variables and hence has a binomial distribution. Let the true and unknown expectation of $X_i$ for any $i$ be $p$, then $n\bar{X}$ has the expectation $np$ and variance $np(1-p)$.

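By the Central Limit Theorem we know that the random variable

$$Z = \frac{n\bar{X} - np}{\sqrt{np(1-p)}} = \frac{\sqrt{n}\,(\bar{X} - p)}{\sqrt{p(1-p)}}$$

asymptotically (for large $n$) tends to the standard Gaussian distribution. From here we can express

$$\bar{X} - p = Z\sqrt{\frac{p(1-p)}{n}},$$

but we know that $Z$ is approximately standard Gaussian, and that gives us a weapon to compute the confidence intervals for our estimates!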
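One way to convince yourself of the Gaussian limit is to simulate it. The sketch below standardizes the number of successes in many repeated samples and checks that the results have mean near 0, variance near 1, and fall within ±1.96 about 95% of the time, as a standard Gaussian would. The values of `p`, `n` and `repeats` are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, pvariance

def standardized_sum(p, n, rng):
    """Simulate n Bernoulli(p) games and return (S - n*p) / sqrt(n*p*(1-p))."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return (successes - n * p) / sqrt(n * p * (1 - p))

rng = random.Random(42)
p, n, repeats = 0.45, 500, 2000  # hypothetical values for illustration

zs = [standardized_sum(p, n, rng) for _ in range(repeats)]
inside = sum(1 for z in zs if abs(z) <= 1.96) / repeats

print(f"mean of Z:      {mean(zs):.3f}")
print(f"variance of Z:  {pvariance(zs):.3f}")
print(f"share within ±1.96: {inside:.3f}")
```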

For example, there is a chance that , see here for other values (in this table you should look up the value of half of the probability, e.g., since the integration starts from zero.)
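If no table is at hand, the same lookup can be done with Python's standard library. This is only a sketch: the function names are mine, and the three confidence levels are common textbook choices rather than values taken from this article. `table_value` mimics a table that integrates from zero, which is why it returns half of the two-sided probability.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def table_value(z):
    """Area under the standard Gaussian density between 0 and z."""
    return STANDARD_NORMAL.cdf(z) - 0.5

def z_for_confidence(confidence):
    """Bound z such that P(|Z| <= z) equals the given confidence level."""
    return STANDARD_NORMAL.inv_cdf(0.5 + confidence / 2)

for confidence in (0.90, 0.95, 0.99):
    z = z_for_confidence(confidence)
    print(f"confidence {confidence:.2f}: table entry {table_value(z):.3f}, z = {z:.3f}")
```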

Let us assume that (a typical value in our statistics study), then for we get a error of , i.e., less than . For this confidence interval halfwidth is so that I dare say that a result of in our data is a statistically significant difference while is not!
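The halfwidth computation can be sketched as follows, using the normal approximation from above. The helper names and all numeric inputs (`p`, `n`, the 95% level) are hypothetical and only illustrate how the interval shrinks as the number of games grows.

```python
from math import sqrt
from statistics import NormalDist

def halfwidth(p, n, confidence=0.95):
    """Approximate confidence-interval halfwidth for a proportion p over n games."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)

def significantly_different(p_hat, reference, n, confidence=0.95):
    """True if the observed share p_hat lies outside the interval around reference."""
    return abs(p_hat - reference) > halfwidth(p_hat, n, confidence)

# Hypothetical sample sizes and probability, for illustration only.
for n in (100, 1000, 10000):
    print(f"n = {n:>5}: halfwidth = {halfwidth(0.45, n):.4f}")
```

Because the halfwidth scales with the inverse square root of the number of games, four times as many games are needed to halve it.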

So you still do not believe in statistics? Well, all of the above would be true if our random variables had identical distributions. Alas, in the real virtual PPM life the outcome of a game with Tactics A vs Tactics B depends on so many additional factors that we may fairly assume that for each $i$ the expectation $p_i$ is different. A generalization of the central limit theorem still holds, but what we are estimating is the value of $\frac{1}{n}\sum_{i=1}^{n} p_i$. This value depends on the ever-changing distribution of player skills, energies, injuries, and teams that play this tactics – countertactics pair. In short, it depends on the available sample pool.
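A small simulation makes the point about the sample pool concrete. Here each game gets its own success probability, drawn from a made-up uniform range (0.3 to 0.6, purely an assumption for illustration), and the sample mean is compared with the average of those per-game probabilities.

```python
import random

def simulate_heterogeneous(n_games, low=0.3, high=0.6, seed=0):
    """Simulate games whose success probabilities differ from game to game.

    The uniform range [low, high] is a hypothetical stand-in for the
    unknown spread of skills, energies and injuries in the sample pool.
    """
    rng = random.Random(seed)
    probs = [rng.uniform(low, high) for _ in range(n_games)]
    outcomes = [1 if rng.random() < p_i else 0 for p_i in probs]
    return probs, outcomes

probs, outcomes = simulate_heterogeneous(20000)
average_p = sum(probs) / len(probs)
estimate = sum(outcomes) / len(outcomes)

print(f"average of the per-game probabilities: {average_p:.3f}")
print(f"sample mean of the outcomes:           {estimate:.3f}")
```

The two printed numbers should be close, while neither says much about any single game whose own probability sits at the edge of the range.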

As long as the parameter distribution in this sample pool is neat enough, our data will give a faithful estimate of the performance of Tactics A vs Tactics B for “average” teams. I hope I have convinced you and thus can spare more detailed formulations and mathematical proofs of the corresponding results and conclude that there is a very good chance that our tactics-countertactics tables actually “work” for an “average” team, quod erat demonstrandum by this article.

So.. that was statistics, alright. Can’t say I followed too much of it, if mostly for lack of trying.

But if the statistic holds for ‘the average team’, how will we know what that is? Just the mean of all samples? I wonder how I’ll be able to relate my own team to that.

Comment by Dessaro — June 12, 2009 @ 12:23 pm

Hi Dessaro,

yes, the statistics are just a sum of results over thousands of games played by various teams – each with a unique distribution of players’ skill values. You make a good point in doubting the use of tactical tables for a particular team. The only answer I can give is – each of us should seek and find his own way of dealing with tactics. For me, the table is a useful starting point, like a lamp post 🙂 For a while it may be a fairly good support, but later the lamplight can guide you to new shores. I mean, you might try to refine and personalize the tactical schemes by trying out various things with your very own team. Actually I am happy to say that this table is NOT AT ALL the final word in the tactical lore. Makes the game challenging and interesting, doesn’t it? Have fun and good luck! 🙂

Comment by glanvalleyeaglets — June 13, 2009 @ 7:44 pm