Entropy Lower Bound in Terms of the $\ell_2$ Norm

Defining $p_i = 1/n + q_i$ (so that $\sum_i q_i = 0$), we get, working in nats:

$$\begin{align} H({\bf p}) &=-\sum p_i \log(p_i)\\ &=-\sum (1/n +q_i) \log(1/n +q_i)\\ &=-\sum ( 1/n +q_i) [\log(1/n ) + \log(1+ n q_i )]\\ &= \log(n) -\sum ( 1/n +q_i) \log(1+ n q_i)\\ &\ge \log(n) -\sum ( 1/n +q_i) n q_i\\ & = \log(n) - n\sum q_i^2\\ & = \log(n) - n \, \lVert{\bf p}- 1/n\rVert^2_2\\ \end{align}$$

Here the inequality uses $\log(1+x)\le x$, and the cross terms vanish because $\sum_i q_i = 0$.

Or, if you prefer,

$$ H({\bf p}) \ge \log(n)\left(1 - \frac{n}{\log n}\sum q_i^2 \right) $$

Of course, the bound is vacuous if $\sum q_i^2\ge \frac{\log(n)}{n}$, since the right-hand side is then nonpositive.
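
A quick numerical sanity check of the bound (a sketch, not part of the derivation; the Dirichlet sampling below is an arbitrary choice of test distributions):

```python
# Minimal sketch: numerically check H(p) >= log(n) - n * ||p - 1/n||_2^2, in nats.
import numpy as np

rng = np.random.default_rng(0)
n = 50
for _ in range(1000):
    p = rng.dirichlet(np.ones(n))                 # random probability vector of length n
    H = -np.sum(p * np.log(p))                    # entropy in nats (Dirichlet draws are a.s. positive)
    bound = np.log(n) - n * np.sum((p - 1.0 / n) ** 2)
    assert H >= bound - 1e-9                      # the bound holds (it may be negative, i.e. vacuous)
```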


Let $(X,\upsilon)$ be a finite measure space. Let $\sigma=\frac{\upsilon}{V}$ be the uniform probability distribution on $X$ (where $V=\upsilon(X)$), and let $\rho$ be a probability distribution that is absolutely continuous with respect to $\upsilon$, with density $p$. Then the inequality
$$\begin{align}-h(p)+\ln V&\leq V\lVert p-\tfrac{1}{V}\rVert_2^2\text{,}\\ h(p)&\stackrel{\mathrm{def}}{=}-\int_X p(x)\ln p(x)\,\mathrm{d}\upsilon_{x}\text{,}\\ \lVert p-q\rVert^2_2&\stackrel{\mathrm{def}}{=}\int_X\lvert p(x)-q(x)\rvert^2\,\mathrm{d}\upsilon_x \end{align}$$
is exactly the inequality between the KL divergence and the $\chi^2$-divergence
$$\begin{align}D(\rho\parallel\sigma)&\leq \chi^2(\rho\parallel\sigma)\text{,}\\ D(\rho\parallel\sigma)&\stackrel{\text{def}}{=}\int_X\left(\frac{\mathrm{d}\rho}{\mathrm{d}\sigma}\ln\frac{\mathrm{d}\rho}{\mathrm{d}\sigma}-\frac{\mathrm{d}\rho}{\mathrm{d}\sigma}+1\right)\mathrm{d}\sigma_x\text{,}\\ \chi^2(\rho\parallel\sigma)&\stackrel{\text{def}}{=}\int_X\left(\frac{\mathrm{d}\rho}{\mathrm{d}\sigma}-1\right)^2\mathrm{d}\sigma_x\text{;} \end{align}$$
this inequality in turn follows from the pointwise bound
$$t\ln t - t + 1 \leq (t-1)^2\qquad (t\geq 0)\text{.}$$
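
For a concrete discrete illustration (a sketch only, not part of the answer above): the snippet below checks the pointwise bound $t\ln t - t + 1 \le (t-1)^2$ on a grid of $t \ge 0$, and then checks $D(\rho\parallel\sigma) \le \chi^2(\rho\parallel\sigma)$ for a randomly drawn discrete $\rho$ against the uniform $\sigma$; the Dirichlet draw is an arbitrary test case.

```python
# Sketch: check t*ln(t) - t + 1 <= (t - 1)^2 on a grid, then the implied inequality
# D(rho || sigma) <= chi^2(rho || sigma) for a random discrete rho with sigma uniform.
import numpy as np

t = np.linspace(1e-12, 10.0, 100_000)
assert np.all(t * np.log(t) - t + 1 <= (t - 1) ** 2 + 1e-12)

rng = np.random.default_rng(1)
n = 20
rho = rng.dirichlet(np.ones(n))            # a random probability vector; sigma_i = 1/n
ratio = n * rho                            # Radon-Nikodym derivative d(rho)/d(sigma) on each atom
D = np.sum(ratio * np.log(ratio)) / n      # KL divergence D(rho || sigma), in nats
chi2 = np.sum((ratio - 1.0) ** 2) / n      # chi-squared divergence
assert D <= chi2 + 1e-12
```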


To complement Leon Bloy's answer and show that the bound he (and K B Dave) obtained cannot be significantly improved upon (i.e., that $c_n = \Omega\!\left(\frac{n}{\log n}\right)$ is necessary): fix $\varepsilon \in (0,1]$, and assume without loss of generality that $n=2m$ is even. Define $\mathbf{p}^{(\varepsilon)}$ as the probability mass function (over $\{1,\dots,n\}$) given by
$$ \mathbf{p}^{(\varepsilon)}_i = \begin{cases} \frac{1+\varepsilon}{n} & \text{ if } i \leq m, \\ \frac{1-\varepsilon}{n} & \text{ if } i > m. \end{cases} $$

Note that
$$\begin{align} H(\mathbf{p}^{(\varepsilon)}) &= \sum_{i=1}^m \frac{1+\varepsilon}{n}\log \frac{n}{1+\varepsilon} + \sum_{i=m+1}^n \frac{1-\varepsilon}{n}\log \frac{n}{1-\varepsilon} \\ &= \log n - \frac{1}{2}\left((1+\varepsilon)\log(1+\varepsilon) + (1-\varepsilon)\log(1-\varepsilon)\right) \\ &= \log n - \frac{\varepsilon^2}{2} + o(\varepsilon^3) \tag{1} \end{align}$$
while
$$ \lVert \mathbf{p}^{(\varepsilon)} - \mathbf{u}_n\rVert_2^2 = \frac{\varepsilon^2}{n}, \tag{2} $$
where $\mathbf{u}_n$ denotes the uniform distribution on $\{1,\dots,n\}$, so that
$$ H(\mathbf{p}^{(\varepsilon)}) = \log n \left(1 - \left(1/2+o_\varepsilon(1)\right)\cdot\frac{n}{\log n}\lVert \mathbf{p}^{(\varepsilon)} - \mathbf{u}_n\rVert_2^2\right). \tag{3} $$
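
As a quick numerical check of $(1)$–$(3)$ (a sketch only; the values of $n$ and $\varepsilon$ below are arbitrary choices): for small $\varepsilon$ the gap $\log n - H(\mathbf{p}^{(\varepsilon)})$ is close to $\varepsilon^2/2$, and $n\lVert \mathbf{p}^{(\varepsilon)} - \mathbf{u}_n\rVert_2^2/\varepsilon^2$ equals $1$.

```python
# Sketch: compute H(p^(eps)) and compare with (1)-(3) for a few values of eps.
import numpy as np

n = 1000
m = n // 2
for eps in (0.01, 0.1, 0.5, 1.0):
    p = np.concatenate([np.full(m, (1 + eps) / n), np.full(m, (1 - eps) / n)])
    H = -np.sum(p[p > 0] * np.log(p[p > 0]))      # entropy in nats; handles eps = 1 (zero entries)
    dist2 = np.sum((p - 1.0 / n) ** 2)            # equals eps^2 / n, as in (2)
    print(f"eps={eps}: log(n)-H={np.log(n)-H:.6f}, eps^2/2={eps**2/2:.6f}, n*dist2/eps^2={n*dist2/eps**2:.6f}")
```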

If you want to avoid the asymptotics as $\varepsilon \to 0$, you can instead fix $\varepsilon = 1$ (for instance) and get that $$H(\mathbf{p}^{(\varepsilon)}) = \log n \left(1 - c'_\varepsilon\,\frac{n}{\log n}\lVert \mathbf{p}^{(\varepsilon)} - \mathbf{u}_n\rVert_2^2\right)$$ for some constant $c'_\varepsilon > 0$; indeed, for $\varepsilon = 1$ one has $H(\mathbf{p}^{(1)}) = \log\frac{n}{2}$ and $\lVert \mathbf{p}^{(1)} - \mathbf{u}_n\rVert_2^2 = \frac{1}{n}$, which gives $c'_1 = \log 2$.