How do I determine the standard deviation (stddev) of a set of values?
While the sum of squares algorithm works fine most of the time, it can cause big trouble if you are dealing with very large numbers. You basically may end up with a negative variance...
Plus, don't never, ever, ever, compute a^2 as pow(a,2), a * a is almost certainly faster.
By far the best way of computing a standard deviation is Welford's method. My C is very rusty, but it could look something like:
public static double StandardDeviation(List<double> valueList)
{
double M = 0.0;
double S = 0.0;
int k = 1;
foreach (double value in valueList)
{
double tmpM = M;
M += (value - tmpM) / k;
S += (value - tmpM) * (value - M);
k++;
}
return Math.Sqrt(S / (k-2));
}
If you have the whole population (as opposed to a sample population), then use return Math.Sqrt(S / (k-1));
.
EDIT: I've updated the code according to Jason's remarks...
EDIT: I've also updated the code according to Alex's remarks...
10 times faster solution than Jaime's, but be aware that, as Jaime pointed out:
"While the sum of squares algorithm works fine most of the time, it can cause big trouble if you are dealing with very large numbers. You basically may end up with a negative variance"
If you think you are dealing with very large numbers or a very large quantity of numbers, you should calculate using both methods, if the results are equal, you know for sure that you can use "my" method for your case.
public static double StandardDeviation(double[] data)
{
double stdDev = 0;
double sumAll = 0;
double sumAllQ = 0;
//Sum of x and sum of x²
for (int i = 0; i < data.Length; i++)
{
double x = data[i];
sumAll += x;
sumAllQ += x * x;
}
//Mean (not used here)
//double mean = 0;
//mean = sumAll / (double)data.Length;
//Standard deviation
stdDev = System.Math.Sqrt(
(sumAllQ -
(sumAll * sumAll) / data.Length) *
(1.0d / (data.Length - 1))
);
return stdDev;
}
The accepted answer by Jaime is great, except you need to divide by k-2 in the last line (you need to divide by "number_of_elements-1"). Better yet, start k at 0:
public static double StandardDeviation(List<double> valueList)
{
double M = 0.0;
double S = 0.0;
int k = 0;
foreach (double value in valueList)
{
k++;
double tmpM = M;
M += (value - tmpM) / k;
S += (value - tmpM) * (value - M);
}
return Math.Sqrt(S / (k-1));
}
The Math.NET library provides this for you to of the box.
PM> Install-Package MathNet.Numerics
var populationStdDev = new List<double>(1d, 2d, 3d, 4d, 5d).PopulationStandardDeviation();
var sampleStdDev = new List<double>(2d, 3d, 4d).StandardDeviation();
See PopulationStandardDeviation for more information.