Discrete entropy of the integer part of a random variable

Using (say) decimal notation, ASCII encoding, and a delimiter symbol such as a space or comma, as well as the law of large numbers, one can almost surely encode $N$ independent copies of $\lfloor X \rfloor$ using $O( N {\bf E} \log( 2 + |X| ) ) + o(N)$ bits. Applying the Shannon source coding theorem, we conclude that
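As a quick numerical illustration of this bit count (my own sketch, not part of the argument; the exponential distribution and all names here are just for the demo), one can actually encode $N$ floors in decimal ASCII and compare against $\sum_i \log(2+x_i)$ plus delimiter overhead:

```python
import math
import random

# Hedged sketch: encode N floors of a nonnegative random variable in
# decimal ASCII with comma delimiters, and check that the bit count is
# dominated by 8 * sum(log10(2 + x) + 2), i.e. O(N E log(2 + |X|)) + O(N).
random.seed(0)
N = 10_000
samples = [random.expovariate(0.01) for _ in range(N)]  # mean ~100
floors = [math.floor(x) for x in samples]

encoding = ",".join(str(n) for n in floors)  # ASCII: 8 bits per character
bits_used = 8 * len(encoding)

# floor(x) >= 0 has at most log10(2 + x) + 1 decimal digits, plus one
# delimiter character per sample.
bits_bound = 8 * sum(math.log10(2 + x) + 2 for x in samples)
assert bits_used <= bits_bound
print(f"bits per sample: {bits_used / N:.2f} <= {bits_bound / N:.2f}")
```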

$$ {\bf H}( \lfloor X \rfloor ) \ll {\bf E} \log(2 + |X| )$$

which by Jensen's inequality also gives

$$ {\bf H}( \lfloor X \rfloor ) \ll_p \log(2 + {\bf E} |X|^p)$$

for any $0 < p < \infty$.
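For completeness, one way to spell out the Jensen step (the constant $C_p$ below is my own bookkeeping): by concavity of the logarithm,

$$ {\bf E} \log(2+|X|) = \frac{1}{p} {\bf E} \log\big( (2+|X|)^p \big) \le \frac{1}{p} \log {\bf E} (2+|X|)^p, $$

and since $(2+|X|)^p \le C_p (1+|X|^p)$ (e.g. $C_p = 2^p$ for $0 < p \le 1$ and $C_p = 2^{2p-1}$ for $p \ge 1$),

$$ \frac{1}{p} \log {\bf E} (2+|X|)^p \le \frac{1}{p} \log\big( C_p (1 + {\bf E} |X|^p) \big) \ll_p \log( 2 + {\bf E} |X|^p ). $$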


$\newcommand{\fx}{\lfloor X\rfloor}$ $\newcommand\Z{\mathbb{Z}}$ We shall prove more than requested: that $H(\fx)<\infty$ if $E\ln(1+|X|)<\infty$.

Indeed, let $$p_n:=P(\fx=n),$$ so that $$H(\fx)=-\sum_{n\in\Z}p_n\ln p_n.$$ Let $q\colon\mathbb R\to(0,\infty)$ be any function such that $$\sum_{n\in\Z}q(n)=1\tag{1}$$ and $$q(x)\le cq(\lfloor x\rfloor)\tag{2}$$ for some real $c>0$ and all real $x$.

Then by the Gibbs inequality for the Kullback–Leibler divergence between $(p_n)_{n\in\Z}$ and $(q(n))_{n\in\Z}$ we have $$0\le KL((p_n)_{n\in\Z}||(q(n))_{n\in\Z})=\sum_{n\in\Z}p_n\ln\frac{p_n}{q(n)}=-H(\fx)+\sum_{n\in\Z}p_n\ln\frac1{q(n)},$$ whence, in view of (2), $$H(\fx)\le\sum_{n\in\Z}p_n\ln\frac1{q(n)} \\ =\sum_{n\in\Z}\int_{[n,n+1)}P(X\in dx)\ln\frac1{q(n)} \\ \le\sum_{n\in\Z}\int_{[n,n+1)}P(X\in dx)\ln\frac c{q(x)} \\ =E\ln\frac c{q(X)}=\ln c+E\ln\frac1{q(X)}.$$ So, $$H(\fx)<\infty\quad\text{if}\quad E\ln\frac1{q(X)}<\infty.$$ Taking here e.g. $q(x)=\frac C{(1+|x|)^2}$, where $C:=1/\sum_{n\in\Z}\frac1{(1+|n|)^2}\,[=\frac3{\pi ^2-3}]$, we have conditions (1) and (2) satisfied (the latter with $c=4$, since $1+|\lfloor x\rfloor|\le2(1+|x|)$ for all real $x$). So, $$H(\lfloor X\rfloor)<\infty\quad\text{if}\quad E\ln(1+|X|)<\infty.$$ It follows that for any real $a>0$ $$H(\lfloor X\rfloor)<\infty\quad\text{if}\quad E|X|^a<\infty,$$ as was initially desired.
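A numerical sanity check of this bound (my own illustration, not part of the proof; the distribution $p_n\propto(1+n)^{-3}$ is an arbitrary example with $E\ln(1+X)<\infty$, and $c=1$ because an integer-valued $X$ has $q(X)=q(\lfloor X\rfloor)$):

```python
import math

# Hedged check: q(x) = C/(1+|x|)^2 with C = 3/(pi^2 - 3) sums to 1 over
# the integers (condition (1)), and for an integer-valued X (so c = 1)
# with p_n proportional to (1+n)^(-3), Gibbs gives H(X) <= E ln(1/q(X)).
C = 3 / (math.pi ** 2 - 3)

def q(n):
    return C / (1 + abs(n)) ** 2

# Condition (1): the truncation error of the tail is about 2C/M.
M = 10 ** 5
q_total = sum(q(n) for n in range(-M, M + 1))
assert abs(q_total - 1) < 1e-3

# An illustrative integer-valued X supported on {0, 1, ..., M-1}.
weights = [(1 + n) ** -3 for n in range(M)]
Z = sum(weights)  # approximately zeta(3)
p = [w / Z for w in weights]

H = -sum(pn * math.log(pn) for pn in p)               # entropy in nats
rhs = sum(pn * math.log(1 / q(n)) for n, pn in enumerate(p))
assert H <= rhs                                       # Gibbs inequality
print(f"H = {H:.4f} <= E ln(1/q(X)) = {rhs:.4f}")
```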


Since $\lfloor X\rfloor$ has finite entropy if and only if $|\lfloor X\rfloor|$ has finite entropy, it suffices to consider random variables taking values in the natural numbers. Write $p_n$ for $\mathbb P(X=n)$ (so that $\sum_n p_n=1$). We have $X\in L^q$ if and only if $\sum p_n n^q<\infty$.

Suppose $X\in L^q$ so that $\sum p_n n^q<\infty$. Then let $S_1=\{n\colon p_n<\frac{1}{n^2}\}$ and $S_2=\{n\colon p_n\ge \frac 1{n^2}\}$. We have $$ H(X)=\sum_n -p_n\log p_n=-\sum_{n\in S_1}p_n\log p_n-\sum_{n\in S_2}p_n\log p_n. $$ Since $-t\log t$ is increasing on $[0,\frac 1e]$ and $p_n<\frac 1{n^2}\le\frac 14<\frac 1e$ for $n\in S_1$ with $n\ge 2$, the first sum is bounded above by $$ \sum_{n\in S_1,\,n\ge 2}\frac{2\log n}{n^2}<\infty $$ plus the single bounded term coming from $n=1$, if present. There exists an $n_0$ so that for $n\ge n_0$, $2\log n<n^q$. For $n\in S_2$ such that $n\ge n_0$, $-\log p_n<2\log n<n^q$, so that $$ -\sum_{n\in S_2,\,n\ge n_0}p_n\log p_n\le \sum_{n\in S_2,\,n\ge n_0}p_n n^q<\infty. $$ Hence $H(X)<\infty$. (This trick appears in a couple of papers of mine: one with Ciprian Demeter in NYJM and another more recent preprint with Tamara Kucherenko and Christian Wolf.)
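A small numerical illustration of the split (my own, with a distribution cooked up so that both $S_1$ and $S_2$ are nonempty; here $q=1$, and $2\log n<n$ already holds for all $n\ge1$, so $n_0=1$):

```python
import math

# Hedged sketch: a distribution with mass ~0.64 at n = 2 (so 2 lands in
# S2) and a (1/n^3)-tail (landing in S1), normalized; q = 1.
M = 10 ** 5
w = {1: 0.2, 2: 0.5}
for n in range(3, M):
    w[n] = n ** -3
Z = sum(w.values())
p = {n: v / Z for n, v in w.items()}

S1 = {n for n, pn in p.items() if pn < 1 / n ** 2}
S2 = {n for n, pn in p.items() if pn >= 1 / n ** 2}
assert 2 in S2 and len(S1) > 0

H = -sum(pn * math.log(pn) for pn in p.values())

# Term-by-term bounds: -t*ln(t) is increasing on [0, 1/e], and for n >= 2
# in S1 we have p_n < 1/n^2 <= 1/4 < 1/e, so -p_n ln p_n <= 2 ln(n)/n^2.
# For n in S2, -ln p_n <= 2 ln(n) < n = n^q. The n = 1 term (if in S1) is
# kept exactly as a bounded head term.
head = -p[1] * math.log(p[1]) if 1 in S1 else 0.0
bound1 = sum(2 * math.log(n) / n ** 2 for n in S1 if n >= 2)
bound2 = sum(p[n] * n for n in S2)
assert H <= head + bound1 + bound2
print(f"H = {H:.4f} <= {head + bound1 + bound2:.4f}")
```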