How to draw a hanging rootogram in Mathematica?
ClearAll[hangingRootogram]
hangingRootogram[dat_, estdist_, binspec_: Automatic][sc___ : .9, o: OptionsPattern[]] :=
With[{hd = HistogramDistribution[dat, binspec], bins = HistogramList[dat, binspec][[1]]},
With[{es = sc Min@Differences@bins},
DiscretePlot[{Sqrt@PDF[estdist, x] - Sqrt@PDF[hd, x], Sqrt@PDF[estdist, x]}, {x, bins},
ExtentSize -> es, PlotMarkers -> {None, {"Point", Large}}, Joined -> {False, True},
Filling -> {1 -> {2}}, o]]]
Examples:
data = RandomVariate[NegativeBinomialDistribution[10, 0.3], 10^2];
edist = EstimatedDistribution[data, NegativeBinomialDistribution[n, p],
ParameterEstimator -> "MethodOfMoments"];
Row[{Histogram[data, Automatic, "PDF", ImageSize -> 400],
hangingRootogram[data, edist][.8, ImageSize -> 400,
PlotStyle -> {Blue, Red},
FillingStyle -> Directive[Opacity[.7], Blue, EdgeForm[{Blue, Thick}]]]}]
Row[hangingRootogram[data, EstimatedDistribution[data, #,
ParameterEstimator -> "MethodOfMoments"]][.8, ImageSize -> 400,
PlotStyle -> {Blue, Red}, PlotLabel -> #,
FillingStyle -> Directive[Opacity[.7], Blue, EdgeForm[{Blue, Thick}]],
PlotRange -> Full] & /@
{NegativeBinomialDistribution[n, p], NegativeBinomialDistribution[n, .5],
PoissonDistribution[n]}]
data = RandomVariate[PoissonDistribution[5], 10^3];
edist = EstimatedDistribution[data, PoissonDistribution[n],
ParameterEstimator -> "MethodOfMoments"];
Row[{Histogram[data, {3}, "PDF", ImageSize -> 400],
hangingRootogram[data, edist, {3}][.8, ImageSize -> 400,
PlotStyle -> {Blue, Red},
FillingStyle -> Directive[Opacity[.7], Blue, EdgeForm[{Blue, Thick}]]]}]
The defined function RootHistogram
makes a "hanging rootogram" more-or-less following this definition.
The first argument is the data. The second argument dist
is optional distribution. The function uses SmoothHistogram
for the hanging curve and the third argument, bandWidth
, is the band width argument of SmoothHistogram
. The bspec
argument is given to HistogramList
. The sqRoot
argument is in adherence to the mentioned definition:
[...] As in the rootogram, the vertical axis is scaled to the square-root of the frequencies so as to draw attention to discrepancies in the tails of the distribution.
Clear[RootHistogram]
RootHistogram[data : {_?NumberQ ..}, dist_: Automatic,
bandWidth_: "StandardDeviation", bspec_: Automatic,
sqRoot : (True | False) : True, opts : OptionsPattern[]] :=
Block[{gr, shpoints, nf, x0, x1, s, xs, ds, ps},
gr = SmoothHistogram[data, bandWidth, "Intensity"];
shpoints =
SortBy[Cases[gr[[1]], Line[p_] :> p, \[Infinity]][[1]], First];
If[! TrueQ[dist === Automatic],
ds = Table[PDF[dist, x], {x, shpoints[[All, 1]]}];
ds = Rescale[ds, MinMax[ds], MinMax[shpoints[[All, 2]]]];
shpoints[[All, 2]] = ds
];
If[sqRoot, shpoints[[All, 2]] = Sqrt[shpoints[[All, 2]]]];
nf = Nearest[shpoints[[All, 1]] -> Automatic];
{x0, x1} = MinMax[data];
ps = HistogramList[data, bspec];
ps = Transpose[{Mean /@ Partition[ps[[1]], 2, 1], ps[[2]]}];
If[sqRoot, ps[[All, 2]] = Sqrt[ps[[All, 2]]]];
s = Max[Abs[Differences[ps[[All, 1]]]]];
Graphics[{
GrayLevel[0.7],
Map[Rectangle[{#[[1]] - s/2.5,
shpoints[[nf[#[[1]]][[1]], 2]] - #[[2]]}, {#[[1]] + s/2.5,
shpoints[[nf[#[[1]]][[1]], 2]]}] &, ps],
Blue, Line[Select[shpoints, x0 <= #[[1]] <= x1 &]],
Red, Point[Map[shpoints[[nf[#[[1]]][[1]]]] &, ps]]}, opts,
Axes -> True, AspectRatio -> 1/GoldenRatio]
];
dist = PoissonDistribution[8];
data = RandomVariate[dist, 500];
opts = {ImageSize -> 450, Axes -> False, Frame -> True};
Grid[{{Histogram[data, 20, PlotLabel -> "Histogram", opts],
RootHistogram[data, Automatic, "StandardDeviation", 20, True,
PlotLabel ->
"\!\(\*SqrtBox[\(SmoothHistogram\)]\) with hanging \
\!\(\*SqrtBox[\(HistogramList\)]\) panels", opts]},
{RootHistogram[data, Automatic, "StandardDeviation", 20, False,
PlotLabel -> "SmoothHistogram with hanging HistogramList panels", opts],
RootHistogram[data, NormalDistribution[11, 2],"StandardDeviation", 20, True, PlotLabel ->
"\!\(\*SqrtBox[\(Max[SmoothHistogram] PDF[N[11, 2], x]\)]\) with \
hanging \!\(\*SqrtBox[\(HistogramList\)]\) panels", opts]}}]
I don't know how to interprete scaling of frequencies and associated expected curve so I will just plot PDF. This answer isn't complete then!
Here is a simple way to hang those bars using ChartElementFunction
:
d = NormalDistribution[0, 1]
n = 100
data = RandomVariate[d, n];
bspec = {-5, 5, .5};
f[{{xmin_, xmax_}, {ymin_, ymax_}}, ___] := Module[{
m = Mean@{xmin, xmax}, yMax
},
yMax = PDF[d, m];
{
[email protected],
Translate[Rectangle[{xmin, ymin}, {xmax, ymax}], {0, yMax - ymax}],
AbsolutePointSize@7, Red,
Point[{m, yMax}]
}
];
Show[
Plot[ PDF[d, x], {x, #, #2}] & @@ bspec,(*expected*)
Histogram[data, bspec, (*experimental*)
"PDF",
ChartElementFunction -> f
],
PlotRange -> All,
Frame -> True,
GridLines -> {{}, {0}},
GridLinesStyle -> Thick
]
Of course the more points the better match:
n = 10000