Concentration inequality for the median

The most convinient way to get concentration results on the median is isoperimetric inequality. Basically it states that among all geometrical shapes with equal area, the circle has the smallest perimeter. But when this result is generalized to a high-dimension metric space with a probability measure, you may get some counter-intuitive results called "measure concentration".

The most famous isoperimetric inequality maybe the Talagrand's concentration inequality. The inequality states that if $\Omega=\Omega_1\times\ldots\times\Omega_n$ is a product space endowed with a product probability measure and $A$ is a subset in this space, then for any $t>0$, there is $$ \Pr[A_t] \cdot \Pr\left[\overline{A_t}\right] \le e^{-t^2/4} \, , $$ where $A_t = \{ x \in \Omega : \rho(A, x) \le t \}$ with Talagrand's convex distance $\rho(\cdot, \cdot)$. You may refer to the wikipedia page to get the rigorous definition of those symbols in the inequality above.

From the inequality above, we can get a important conclusion: most probability mass (the measure!) is concentrated on events with probability at least $1/2$. For a intuitive explanation, consider $f(X_1,\ldots,X_n)=\sum_{i=1}^nX_i$ where $X_i$'s are independent random variables.And let $A=\{X\,|\,f(X)\leq M[f]\}$, where $X=(X_1,\ldots,X_n)$, and $M[f]$ represents the median of $f$. Then from Talagrand's inequality we can get the following results: $$ \Pr[A]\cdot\Pr[X|f(X)>M[f]+t]\leq e^{-t^2/(4n)}\, . $$ Then from the definition of median we know $\Pr[A]\geq 1/2$, so finally we have $$ \Pr[X|f(X)>M[f]+t]\leq 2e^{-t^2/(4n)}\, , $$ a concentration on the median result. I have omitted many details in the proof and you may refer to McDiarmid's nice survey for a detailed proof.


Suppose that your sample has $k$ elements, that $M$ is your median, and that in your distribution a single element exceeds $M+t$ with probability $p_t$ (dependent on $t$).

Then the probability that your median is at least $M+t$ is the probability that $k$ independent Bernoulli trials, each with success probability $p_t$, have among them at least $(k+1)/2$ successes (if $k$ is odd. If $k$ is even you can replace $(k+1)/2$ by $k/2$ to get an upper bound on the probability). You can then use concentration inequalities for the Binomial distribution (e.g. the Chernoff bound) to bound the probability this occurs in terms of $p_t$.