The [ppm] eyrie

June 3, 2009

What the hell are these numbers!?

Filed under: PPM.statistics, Uncategorized — glanvalleyeaglets @ 10:30 am

Honestly, would you say you believe in statistics? I was taught to distrust any statistics that I didn’t make up myself. (98.3% of all statistics are actually made up.)

When I was younger, I was forced to have some lessons in the noble subject of mathematical statistics after my epic fail at the oral exam in probability theory, so I can hardly be accused of being in love with the matter which Laplace has called “common sense reduced to calculus”.

So, do you believe in statistics? Does the old sailor believe in the drag of the wind? Break the wind down to the molecules and you’ll see myriads of them releasing their kinetic energy to the wrong side of the sails. Break the great laws of Powerplay Manager down to single games and you’ll see nothing but Random. I bet I’ll deny the cosmic order when the mighty Random hits me in midair at the least appropriate time, and that will surely happen, provided that this is a part of the cosmic ppm order.

The topic is – how do we measure the weight of the wind given the molecules?

Let us start with an assumption. The game results are computed by some sophisticated, possibly probabilistic algorithm. (No malevolent demon or creature making fun of our feeble attempts to understand the natural laws.)

Consequently, given the values of all the visible and hidden variables that can influence the game, including but not limited to the skill attributes and seasonal energy of all the players engaged in the game and the tactical options chosen by the managers, there exists a well-defined probability for each event expressed in terms of the result of the game, e.g., the probability of the event “team A wins” or of the event “the total number of shorthanded goals scored by both teams in the game lies between 1 and 3”. The probability exists but remains unknown to us.

Let us first consider the idealized case in which all teams are equal. This is certainly not true now and will be even less true in the future as the difference in rates of team development plays a more and more prominent role. Suppose that we observe n games in which a team with tactics A plays against a team with tactics B. Let us choose an event, e.g., that team A wins or that the game ends in overtime. Let us define the random variable X_k, which takes the value X_k=1 if the event happens in the k-th game and X_k=0 if it does not.

How would you estimate the probability of the event from the data (X_k)_{k=1}^{n}? Of course, you would count the cases when the event happened and divide by the total number of the games, i.e., you’d take the sample mean M:=\frac{\sum_{k=1}^nX_k} {n}. Let us ask the obvious question: how good is this estimate?

Evidently, the random variable nM is the sum of n Bernoulli random variables and hence, assuming the games are independent, has a binomial distribution. Let the true and unknown expectation of X_k for any k be p; then each X_k has variance \sigma^2=p(1-p), and M has the expectation \mathbb{E}(M)=p and variance \mathrm{Var}(M)=\sigma^2/n=p(1-p)/n.
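A quick simulation can make these two moments tangible: the sample mean M should scatter around p, with a variance equal to the single-game variance p(1-p) divided by n. The sketch below is only an illustration under assumed values (p=0.4, n=200, 2000 repetitions, a fixed seed); none of these numbers come from the PPM data, and `simulate_sample_means` is a hypothetical helper name.

```python
import random
from statistics import fmean, pvariance

def simulate_sample_means(p, n, repeats, seed=0):
    """Return `repeats` sample means M, each computed from n Bernoulli(p) games."""
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        # hits = n*M is a binomial(n, p) count of games where the event happened
        hits = sum(1 for _ in range(n) if rng.random() < p)
        means.append(hits / n)
    return means

p, n = 0.4, 200  # assumed values, for illustration only
means = simulate_sample_means(p, n, repeats=2000)

print("mean of M:    ", fmean(means), "expected", p)
print("variance of M:", pvariance(means), "expected", p * (1 - p) / n)
```

Both printed pairs should agree up to simulation noise, which shrinks as the number of repetitions grows.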

By the Central Limit Theorem (with the unknown p in the standard error replaced by its estimate M), the random variable
Z:=\frac{M-p}{\left[M(1-M)/n\right]^{1/2}}
converges in distribution to the standard Gaussian N(0,1) as n\to \infty. From here we can express
p=M-Z\sqrt{\frac{M(1-M)}{n}},
and since Z is approximately N(0,1) for large n, this gives us a weapon to compute confidence intervals for our estimates!

For example, there is a 95\% chance that Z\in (-1.96,+1.96); see here for other values (in this table you should look up half of the probability, e.g., \frac{0.95}{2}=0.475, since the integration starts from zero).
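If no table is at hand, the same critical values can be computed directly. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the helper name `z_for_confidence` and the chosen confidence levels are assumptions made for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def z_for_confidence(level):
    """Two-sided critical value z with P(-z < Z < z) = level for Z ~ N(0,1)."""
    return std_normal.inv_cdf(0.5 + level / 2)

for level in (0.90, 0.95, 0.99):
    z = z_for_confidence(level)
    # A table that integrates from zero lists P(0 < Z < z), i.e. half the level.
    table_entry = std_normal.cdf(z) - 0.5
    print(f"level {level:.2f}: z = {z:.3f}, table entry = {table_entry:.3f}")
```

For the 95\% level this reproduces z of about 1.96 and the table entry 0.475.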

Let us assume that M=0.4 (a typical value in our statistics study). Then for n=200 the 95\% confidence interval has a half-width of 1.96\sqrt{0.4\cdot 0.6/200}\approx 0.0679, i.e., less than 7 percentage points. For n=1000 the half-width shrinks to about 0.030, so I dare say that a result of 50/30 in our data is a statistically significant difference while 44/41 is not!
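The arithmetic above is easy to reproduce. The sketch below computes the half-width z\sqrt{M(1-M)/n} and then applies a crude rule of thumb: two results are treated as clearly different when their intervals do not overlap. It assumes that 50/30 and 44/41 are percentages each observed over roughly n=1000 games, which is a reading of the text rather than a documented fact; the overlap check is a conservative shortcut, not a formal hypothesis test, and both function names are hypothetical.

```python
from math import sqrt

Z95 = 1.96  # 95% two-sided critical value of N(0,1)

def half_width(m, n, z=Z95):
    """Half-width z*sqrt(m(1-m)/n) of the confidence interval around a sample mean m."""
    return z * sqrt(m * (1 - m) / n)

def intervals_overlap(m1, m2, n, z=Z95):
    """Crude check: do the intervals around m1 and m2 (n games each) overlap?"""
    return abs(m1 - m2) <= half_width(m1, n, z) + half_width(m2, n, z)

print(round(half_width(0.4, 200), 4))       # 0.0679
print(round(half_width(0.4, 1000), 4))      # 0.0304
print(intervals_overlap(0.50, 0.30, 1000))  # False: clearly different
print(intervals_overlap(0.44, 0.41, 1000))  # True: within the noise
```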

So you still do not believe in statistics? Well, all of the above would be true if our random variables X_k were identically distributed. Alas, in the real virtual PPM life the outcome of a game with Tactics A vs Tactics B depends on so many additional factors that we may fairly assume that the expectation \mathbb{E}X_k=p_k is different for each k. A generalization of the central limit theorem still holds, but what we are estimating is the value of n^{-1}\sum_{k} p_k. This value depends on the ever-changing distribution of player skills, energies, injuries, and teams that play this tactics-countertactics pair. In short, it depends on the available sample pool.
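To see what the sample mean measures when every game has its own probability p_k, one can simulate a mixed pool. In the sketch below the p_k are drawn uniformly from an assumed range of 0.2 to 0.6, purely for illustration; the real PPM distribution is unknown, and `simulate_mixed_pool` is a hypothetical helper.

```python
import random
from statistics import fmean

def simulate_mixed_pool(n, low, high, seed=1):
    """Simulate n games whose success probabilities p_k differ from game to game.

    Returns (M, mean_p): the observed sample mean and the average of the p_k.
    """
    rng = random.Random(seed)
    p_values = [rng.uniform(low, high) for _ in range(n)]  # hypothetical pool
    outcomes = [1 if rng.random() < p_k else 0 for p_k in p_values]
    return fmean(outcomes), fmean(p_values)

M, mean_p = simulate_mixed_pool(n=5000, low=0.2, high=0.6)
print(f"sample mean M = {M:.3f}, average of p_k = {mean_p:.3f}")
```

The two numbers should land close to each other: the sample mean tracks the pool average of the p_k, not the probability for any single team.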

As long as the parameter distribution in this sample pool is neat enough, our data will give a faithful estimate of the performance of Tactics A vs Tactics B for “average” teams. I hope I have convinced you, so that I can spare you the more detailed formulations and mathematical proofs of the corresponding results and conclude that there is a very good chance that our tactics-countertactics tables actually “work” for an “average” team, quod erat demonstrandum.


2 Comments »

  1. So.. that was statistics, alright. Can’t say I followed too much of it, if mostly for lack of trying.

    But if the statistic holds for ‘the average team’, how will we know what that is? Just the mean of all samples? I wonder how I’ll be able to relate my own team to that.

    Comment by Dessaro — June 12, 2009 @ 12:23 pm

    • Hi Dessaro,

      yes, the statistics are just a sum of results over thousands of games played by various teams – each with a unique distribution of players’ skill values. You make a good point in doubting the usefulness of tactical tables for a particular team. The only answer I can give is that each of us should seek and find our own way of dealing with tactics. For me, the table is a useful starting point, like a lamp post 🙂 For a while it may be a fairly good support, but later the lamplight can guide you to new shores. I mean, you might try to refine and personalize the tactical schemes by trying out various things with your very own team. Actually I am happy to say that this table is NOT AT ALL the final word in the tactical lore. Makes the game challenging and interesting, doesn’t it? Have fun and good luck! 🙂

      Comment by glanvalleyeaglets — June 13, 2009 @ 7:44 pm

