Calculate variance for a lottery?
In response to Daniel Huber's answer: this comes down to which variance estimator you want to use. The easiest way to see this is to make the probabilities all the same, so that the weights don't actually do anything:
payouts = {7, 3, 1, 0};
probabilities = {1/4, 1/4, 1/4, 1/4};
Total[probabilities]
1
These two are now the same (as they should be):
Variance[payouts]
Variance[WeightedData[payouts, probabilities]]
115/12
115/12
The other calculation gives something different:
mean = payouts . probabilities;
((payouts - mean)^2) . probabilities
115/16
That's because that formula is the maximum-likelihood estimate of the variance, i.e. the 1/n estimate instead of the 1/(n - 1) one:
Total[((payouts - mean)^2)]/Length[payouts]
115/16
I'm not going into the pros and cons of each estimator; suffice it to say that Mathematica's Variance uses the unbiased one.
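For what it's worth, the two results differ only by a factor of n/(n - 1). A short sketch of that relation for the equal-weight example, where the symbols s^2 (the 1/(n - 1) estimate), \hat{\sigma}^2 (the 1/n estimate) and \bar{x} (the sample mean) are my own notation and not anything Mathematica uses:

```latex
\begin{align*}
s^2 &= \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2
     = \frac{n}{n-1}\,\hat{\sigma}^2,
\qquad
\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2 \\[4pt]
n = 4:\quad
\frac{4}{3}\cdot\frac{115}{16} &= \frac{115}{12}
\end{align*}
```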
Edit
It seems there is some confusion about the intended use case of WeightedData. This function is for representing data obtained from a sample, as the name suggests. If you want to calculate the variance of a theoretical distribution where the payouts and weights are exactly known, WeightedData is not the correct way to represent this distribution. You need to use EmpiricalDistribution instead:
payouts = {7, 3, 1, 0};
probabilities = {1/6, 1/6, 1/3, 1/3};
Variance @ EmpiricalDistribution[probabilities -> payouts]
6
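As a cross-check, the probability-weighted formula from the first part of this answer should reproduce this value when it is applied to the exactly known probabilities. A minimal sketch, where directVariance is just an illustrative name I picked:

```mathematica
payouts = {7, 3, 1, 0};
probabilities = {1/6, 1/6, 1/3, 1/3};

(* Expected payout: each payout times its probability, summed *)
mean = payouts . probabilities;

(* Variance of the distribution: expected squared deviation from the mean *)
directVariance = ((payouts - mean)^2) . probabilities
(* 6 *)

(* Compare with the value reported by the distribution object *)
directVariance == Variance[EmpiricalDistribution[probabilities -> payouts]]
(* True *)
```

So for a fully specified distribution the 1/n-style formula is the appropriate one; the 1/(n - 1) correction only matters when the mean and variance are estimated from a sample.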