Is there a built-in option for detecting outliers in data?

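Outliers are commonly identified by their distance from the center of the data, measured in units of the interquartile range (IQR). The cutoff differs depending on your school of thought. Here I use the interval extending 1.5 IQR above and below the median, which covers roughly 95% of normally distributed data. Note that the more common Tukey convention measures the 1.5 IQR outward from the first and third quartiles rather than from the median.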
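As a rough check of how much of a normal distribution each convention covers, here is a minimal sketch. The names coverage, medianCoverage, and fenceCoverage are made up for this illustration, and the figures in the comments are approximate.

```wolfram
(* Quartiles and IQR of the standard normal distribution *)
{q1, q3} = Quantile[NormalDistribution[], {0.25, 0.75}];
iqr = q3 - q1;

(* Hypothetical helper: probability mass of a standard normal between lo and hi *)
coverage[lo_, hi_] := 
  CDF[NormalDistribution[], hi] - CDF[NormalDistribution[], lo];

(* Median +/- 1.5 IQR (the median is 0 here) *)
medianCoverage = coverage[-1.5 iqr, 1.5 iqr]            (* about 0.957 *)

(* Quartile-based fences: Q1 - 1.5 IQR to Q3 + 1.5 IQR *)
fenceCoverage = coverage[q1 - 1.5 iqr, q3 + 1.5 iqr]    (* about 0.993 *)
```

So the median-based range flags about 4% of normal data as outliers, while quartile-based fences flag well under 1%.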

SeedRandom[90807];
data = Join[RandomVariate[NormalDistribution[], 50], 
   RandomVariate[ChiSquareDistribution[3], 10]];

We can calculate this range with Quartiles.

#[[2]] + {-1.5, 1.5} ( #[[3]] - #[[1]]) &@Quartiles[data]
(* {-1.97723, 2.30552} *)

We can then use this range to Select the outliers from the data:

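getOutliers[dat_, iqrCoeff_] := Module[{q1, med, q3, range},
  (* compute the quartiles once rather than once per data point *)
  {q1, med, q3} = Quartiles[dat];
  range = Interval[med + {-1, 1} iqrCoeff (q3 - q1)];
  Select[dat, ! IntervalMemberQ[range, #] &]
  ]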

Then

getOutliers[data, 1.5]
(* {-2.01804, 6.76676, 2.38043, 3.4204, 6.19569, 4.85708, 3.58404, 2.99772} *)

Since you may want a different cutoff, BoxWhiskerChart lets you alter the IQR coefficient through its ChartElementFunction option.

BoxWhiskerChart[data, "Outliers", 
 ChartElementFunction -> ChartElementDataFunction["BoxWhisker", "IQRCoefficient" -> 1]]


And

getOutliers[data, 1]
(* {-1.77297, 1.96271, -1.46257, -1.29773, -2.01804, -1.49219, 6.76676, 
 2.38043, 3.4204, 6.19569, 4.85708, 3.58404, 2.99772} *)

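You will notice that the chart does not show every point returned by getOutliers. This is most likely because BoxWhiskerChart measures its fences from the first and third quartiles, whereas getOutliers measures from the median, so points lying just beyond the median-based range can still fall inside the whiskers.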

Hope this helps.


The documentation for BoxWhiskerChart is not explicit, but it suggests that outliers are points more than 1.5 interquartile ranges above the third quartile or below the first quartile. Far outliers are those more than 3 interquartile ranges beyond the quartiles.

I offer the following implementation of this definition:

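(* Distance of each point beyond the nearer quartile, in IQR units (0 inside the box) *)
outlierdistance[x_List] := Module[{lq, med, uq},
  {lq, med, uq} = Quartiles[x];
  (Ramp[x - uq] + Ramp[lq - x])/(uq - lq)
  ]
(* Points lying more than 1.5 IQR outside the quartiles *)
outlier[x_List] := Pick[x, Thread[outlierdistance[x] > 1.5]]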
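
To illustrate the same quartile-based rule with an adjustable cutoff, including the 3 IQR threshold for far outliers, here is a minimal self-contained sketch. The names fenceOutliers and sample are hypothetical and are not taken from the BoxWhiskerChart documentation.

```wolfram
(* Hypothetical helper: points more than k IQRs below Q1 or above Q3 *)
fenceOutliers[x_List, k_ : 1.5] := Module[{q1, med, q3, iqr},
  {q1, med, q3} = Quartiles[x];
  iqr = q3 - q1;
  Select[x, # < q1 - k iqr || # > q3 + k iqr &]
  ]

(* Small made-up sample: Quartiles gives Q1 = 3 and Q3 = 8, so the IQR is 5 *)
sample = {1, 2, 3, 4, 5, 6, 7, 8, 9, 30};

fenceOutliers[sample]      (* ordinary outliers, fences at -4.5 and 15.5 *)
fenceOutliers[sample, 3]   (* far outliers, fences at -12 and 23 *)
```

In this sample the value 30 lies (30 - 8)/5 = 4.4 IQRs above the third quartile, so both calls return it.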