How to calculate a standard deviation [array]

You already have some good answers on calculating standard deviation, but I'd like to add Knuth's algorithm for calculating variance to the list. It performs the calculation in a single pass over the data, and the standard deviation is then just the square root of the variance, as pointed out above. It also lets you read off intermediate values of the variance as you go, if that proves useful.
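As a rough illustration of the single-pass idea, here is a minimal sketch of the running update that Knuth describes in The Art of Computer Programming, Vol. 2 (usually credited to Welford). The class name RunningVariance and its members are my own choices for this example, and it assumes you want the sample variance (dividing by n - 1).

```csharp
using System;

// Hypothetical helper for illustration: keeps a running mean and a running
// sum of squared deviations so the variance is available after every value.
public sealed class RunningVariance
{
    private long _count;
    private double _mean;
    private double _m2; // sum of squared deviations from the current mean

    public void Add(double x)
    {
        _count++;
        double delta = x - _mean;
        _mean += delta / _count;
        _m2 += delta * (x - _mean); // second factor uses the updated mean
    }

    public long Count => _count;
    public double Mean => _mean;

    // Sample variance (n - 1 in the denominator); 0 until two values are seen.
    public double SampleVariance => _count > 1 ? _m2 / (_count - 1) : 0.0;

    public double SampleStandardDeviation => Math.Sqrt(SampleVariance);
}
```

Call Add for each value as it arrives and read SampleStandardDeviation whenever you need an intermediate result; nothing has to be stored or revisited.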

Re: "Fast-Scan if the values follows a cumulative pattern," if your data is expected to grow linearly, I'd suggest computing a mean and variance for the difference between successive elements (10.5, 10.4 and 23.0 would be the first three difference values from your data). Then find outliers of these difference values instead of the data points. This will make anomalous data values like 1400.32 in your example much more evident, especially when the data eventually grows large enough that 1400 is near the mean.
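To make the difference-based approach concrete, here is a small sketch. The class and method names are hypothetical, and the rule "more than threshold sample standard deviations from the mean difference" is just one possible outlier test, not the only one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class DifferenceOutliers
{
    // Returns data[i + 1] - data[i] for each adjacent pair.
    public static double[] SuccessiveDifferences(IReadOnlyList<double> data)
    {
        var diffs = new double[Math.Max(0, data.Count - 1)];
        for (int i = 0; i < diffs.Length; i++)
        {
            diffs[i] = data[i + 1] - data[i];
        }
        return diffs;
    }

    // Returns the indices of differences lying more than `threshold`
    // sample standard deviations away from the mean difference.
    public static List<int> FindOutlierIndices(IReadOnlyList<double> diffs, double threshold)
    {
        var result = new List<int>();
        if (diffs.Count < 2) return result;

        double mean = diffs.Average();
        double sd = Math.Sqrt(diffs.Sum(d => (d - mean) * (d - mean)) / (diffs.Count - 1));
        if (sd == 0.0) return result; // all differences identical: nothing stands out

        for (int i = 0; i < diffs.Count; i++)
        {
            if (Math.Abs(diffs[i] - mean) > threshold * sd) result.Add(i);
        }
        return result;
    }
}
```

Note that a single bad data point shows up as two adjacent anomalous differences (the jump up to it and the jump back down), so the flagged indices come in pairs around the offending value.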


Given the outliers, you might find the interquartile range to be more useful than the standard deviation. This is simple to calculate: just sort the numbers and find the difference of the values at the 75th percentile and the 25th percentile.
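A minimal sketch of that calculation follows. There are several conventions for picking the value "at" a percentile; this example assumes linear interpolation between the two nearest ranks, and the names Quartiles, Percentile and InterquartileRange are mine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class Quartiles
{
    // Percentile of an already-sorted array, p in [0, 1], using linear
    // interpolation between the two nearest ranks (one convention of several).
    public static double Percentile(double[] sorted, double p)
    {
        if (sorted.Length == 0) throw new ArgumentException("No data.", nameof(sorted));

        double rank = p * (sorted.Length - 1);
        int lower = (int)Math.Floor(rank);
        int upper = (int)Math.Ceiling(rank);
        double fraction = rank - lower;
        return sorted[lower] + fraction * (sorted[upper] - sorted[lower]);
    }

    // Difference between the 75th and 25th percentiles.
    public static double InterquartileRange(IEnumerable<double> values)
    {
        double[] sorted = values.OrderBy(v => v).ToArray();
        return Percentile(sorted, 0.75) - Percentile(sorted, 0.25);
    }
}
```

Because only the middle half of the sorted data is involved, one extreme value at either end leaves the result unchanged, which is what makes it attractive when outliers are present.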


To calculate the sample standard deviation (dividing by n - 1) you can use this code, adapted from Calculate Standard Deviation of Double Variables in C# by Victor Chen.

// Requires: using System; using System.Collections.Generic; using System.Linq;
private double getStandardDeviation(List<double> doubleList)
{
   // Sample standard deviation; the result is NaN for fewer than two values.
   double average = doubleList.Average();
   double sumOfSquaredDeviations = 0;
   foreach (double value in doubleList)
   {
      // Accumulate the squared distance of each value from the mean.
      sumOfSquaredDeviations += (value - average) * (value - average);
   }
   return Math.Sqrt(sumOfSquaredDeviations / (doubleList.Count - 1));
}

The link to Victor's site no longer works, but the title and author are kept here to maintain attribution.


Using LINQ:

double average = someDoubles.Average();
double sumOfSquaresOfDifferences = someDoubles.Select(val => (val - average) * (val - average)).Sum();
double sd = Math.Sqrt(sumOfSquaresOfDifferences / someDoubles.Length); 

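The sd variable will hold the population standard deviation, since the sum is divided by the number of elements. Divide by one less than the number of elements if you want the sample standard deviation instead.

If you have a List<double>, use someDoubles.Count in the last line of code instead of someDoubles.Length.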

Tags: C#, .NET, Algorithm