Explanation for the Wilson Score Interval?

The explanation of the "interval equality principle" was hard for me to follow. However, you do not need it to understand why the Wilson score interval works. The Wilson interval is derived from the Wilson score test, which belongs to a class of tests called Rao score tests. It relies on the asymptotic normality of your estimator, just as the Wald interval does, but it is more robust to deviations from normality. Case in point: Wald intervals are always symmetric about the observed proportion (which may produce bounds less than 0 or greater than 1), while Wilson score intervals are, in general, asymmetric.
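To make the symmetry point concrete, here is a minimal sketch of both intervals using the standard closed-form expressions. The function names, the `confidence` parameter and the example of 1 success in 20 trials are my own choices for illustration, not anything from the paper.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, confidence=0.95):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    p_hat = successes / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval in its usual closed form."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / n
    denom = 1 + z * z / n
    # The centre is pulled from p_hat towards 0.5, which is where the asymmetry shows up.
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return centre - half_width, centre + half_width


# Hypothetical data: 1 success in 20 trials, so p_hat = 0.05.
wald_lo, wald_hi = wald_interval(1, 20)
wilson_lo, wilson_hi = wilson_interval(1, 20)
print("Wald:  ", wald_lo, wald_hi)
print("Wilson:", wilson_lo, wilson_hi)
```

With these numbers the Wald lower bound is negative, while the Wilson bounds stay inside (0, 1) and extend further above the observed proportion than below it.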

Wilson intervals get their asymmetry from the underlying likelihood function for the binomial, which is used to compute the "expected standard error" and "score" (i.e., the first derivative of the log-likelihood function) under the null hypothesis. Since these values change as you vary your null hypothesis, the set of null values for which the normalized score (score/expected standard error) stays within your pre-specified Z-cutoff for significance will not, in general, be symmetric about the observed proportion.
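The inversion described above can be sketched numerically. This is only an illustration under my own assumptions: a brute-force grid over candidate null values, a two-sided 95% cutoff, and the same hypothetical data of 1 success in 20 trials. The names `normalized_score` and `accepted` are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def normalized_score(successes, n, p0):
    """Score divided by its expected standard error, both evaluated at the null value p0."""
    # Score: derivative of the binomial log-likelihood with respect to p, at p0.
    score = successes / p0 - (n - successes) / (1 - p0)
    # Expected (Fisher) information at p0; its square root standardizes the score.
    information = n / (p0 * (1 - p0))
    return score / sqrt(information)


z_cutoff = NormalDist().inv_cdf(0.975)  # two-sided 95% cutoff
successes, n = 1, 20                    # hypothetical data, p_hat = 0.05

# Keep every null value on the grid that the score test does NOT reject.
grid = [i / 10000 for i in range(1, 10000)]
accepted = [p0 for p0 in grid if abs(normalized_score(successes, n, p0)) <= z_cutoff]
score_lo, score_hi = min(accepted), max(accepted)
print(score_lo, score_hi)
```

Up to the grid resolution, the smallest and largest accepted null values are the Wilson bounds, and they are not equidistant from the observed proportion.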

In basic terms, the Wilson interval uses the data more efficiently, as it does not simply aggregate them into a single mean and standard error, but uses the data to develop a likelihood function that is then used to develop an interval.


The difference between the Wald and Wilson intervals is that each is the inverse of the other. This is how the Wilson interval is derived!

As a result we have the following type of equality, which I referred to as the interval equality principle to try to get this idea across.

Wald(tail, α, Wilson(¬tail, α, p)) = p,

and, correspondingly,

Wilson(tail, α, Wald(¬tail, α, P)) = P,

where tail ∈ {0=lower, 1=upper}, α represents the error level (e.g. 1 in 100 = 0.01), and p is an observed probability ∈ [0, 1]. The Wald interval is a legitimate approximation to the Binomial interval about an expected population probability P, but (naturally) a wholly inaccurate approximation to its inverse about p (the Clopper-Pearson interval).
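The two equalities can be checked numerically. The sketch below is a rough illustration rather than code from the paper: the functions `wald` and `wilson` take an extra sample-size argument `n` that the notation above leaves implicit, and α is assumed to be a two-tailed error level, so the critical value is z for α/2.

```python
from math import sqrt
from statistics import NormalDist


def z_value(alpha):
    """Two-tailed critical value of the standard Normal (an assumption of this sketch)."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def wald(tail, alpha, P, n):
    """Normal approximation to the Binomial about P; tail 0 = lower bound, 1 = upper bound."""
    sign = 1 if tail else -1
    return P + sign * z_value(alpha) * sqrt(P * (1 - P) / n)


def wilson(tail, alpha, p, n):
    """Wilson score bound about an observed p; tail 0 = lower bound, 1 = upper bound."""
    z = z_value(alpha)
    sign = 1 if tail else -1
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = (z / denom) * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre + sign * half_width


alpha, n = 0.05, 50  # hypothetical error level and sample size
p = 0.30             # hypothetical observed proportion
P = 0.30             # hypothetical population proportion

for tail in (0, 1):
    other = 1 - tail  # the opposite tail
    print(tail, wald(tail, alpha, wilson(other, alpha, p, n), n))  # recovers p
    print(tail, wilson(tail, alpha, wald(other, alpha, P, n), n))  # recovers P
```

Each printed value comes back as 0.3 up to floating-point error: taking the Wilson bound on one side and then the Wald bound on the opposite side (or the other way round) returns the starting value.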

In fitting contexts it is legitimate to employ a Wald interval about P because we model an ideal P and compute the fit from there. But when we plot observed p, we need to employ the Wilson interval.

I would encourage people to read the paper, not just the excerpt!

Sean Wallis
