Mathematical difference between entropy and energy
First, note that the entropy is well-defined only if $u$ is positive. For solutions of the heat equation, integration by parts shows that on the flat torus, or on $\mathbb{R}^n$ if $u$ is nonnegative and decays fast enough in space, $$ \frac{\partial}{\partial t}\frac{1}{p(p-1)}\log \int u^p = -\frac{\int u^{p-2}|\nabla u|^2}{\int u^p}\le 0 $$ for all $p > 1$. The case $p = 2$ is the usual $L^2$ energy inequality. If you normalize $u$ so that $$ \int u = 1 $$ and take the limit $p \rightarrow 1$, you recover the entropy inequality. So there are $L^p$ analogues of both energy and entropy, depending on your point of view.
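As a minimal numerical sanity check of this monotonicity (not part of the original answer, and no substitute for the integration-by-parts proof), one can evolve the 1D heat equation on a periodic grid with an explicit finite-difference scheme; the grid size, time step, and initial data below are arbitrary choices:

```python
import numpy as np

# Sketch: heat equation on the 1D flat torus, explicit Euler in time.
# We check that (1/(p(p-1))) * log(int u^p) is nonincreasing for several
# p > 1, and that the Shannon entropy -int(u log u) is nondecreasing.

n = 256
L = 2 * np.pi
dx = L / n
x = np.linspace(0.0, L, n, endpoint=False)

# Strictly positive initial density, normalized so that int(u) = 1.
u = 1.0 + 0.5 * np.sin(x) + 0.3 * np.cos(3 * x)
u /= np.sum(u) * dx

dt = 0.2 * dx**2          # stable explicit step (dt < dx^2 / 2)
ps = [1.5, 2.0, 3.0]

def lp_entropy(u, p):
    # The L^p quantity from the answer above: (1/(p(p-1))) log(int u^p)
    return np.log(np.sum(u**p) * dx) / (p * (p - 1))

def shannon(u):
    return -np.sum(u * np.log(u)) * dx

hist_p = {p: [lp_entropy(u, p)] for p in ps}
hist_s = [shannon(u)]

for _ in range(2000):
    # Periodic discrete Laplacian via np.roll
    lap = (np.roll(u, 1) - 2 * u + np.roll(u, -1)) / dx**2
    u = u + dt * lap
    for p in ps:
        hist_p[p].append(lp_entropy(u, p))
    hist_s.append(shannon(u))

for p in ps:
    print(p, np.all(np.diff(hist_p[p]) <= 1e-12))  # monotone, up to roundoff
print("shannon nondecreasing:", np.all(np.diff(hist_s) >= -1e-12))
```

Since the update is a convex combination of neighboring values for this step size, positivity of $u$ is preserved, so both quantities stay well-defined along the discrete flow.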
In information theory, if $u$ is a probability density, then $$ -\int u\log u $$ is called Shannon entropy and $$ \frac{1}{1-p}\log\int u^p $$ is called Rényi entropy.
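For completeness, here is the standard computation connecting the two in the limit $p \to 1$, assuming $\int u = 1$ (so that $\log\int u^p \to 0$ and L'Hôpital's rule applies):

```latex
\[
  \lim_{p\to 1}\frac{1}{1-p}\log\int u^p
  = \lim_{p\to 1}\frac{\frac{d}{dp}\log\int u^p}{-1}
  = -\lim_{p\to 1}\frac{\int u^p \log u}{\int u^p}
  = -\int u \log u ,
\]
```

so the Rényi entropy converges to the Shannon entropy as $p \to 1$.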
I would personally say that physics is all that lies behind the names. One slight difference between them is that, at least in the field of PDEs, whatever we call an entropy is always monotone along the flow of the equation (and is usually normalized, up to constants, to be positive and decreasing), whereas an energy has a broader meaning; one sometimes hears terms like "$H^s$ energy", "higher[-order] energy", "$L^p$ energy [estimate]", and so on. These quantities do not necessarily decrease; they may very well increase, as the $H^s$ energies do for various instances of the NLS (nonlinear Schrödinger) equation. They can also exhibit stranger behaviour, like the usual $L^2$ energy for the Euler equations (I am referring here to the work of De Lellis and Székelyhidi).
It is probably worth mentioning that the heat equation is inherently linked to energy and entropy in two ways:
The heat equation is the gradient flow of the Dirichlet energy in the Hilbert space $L^2$ (classical).
The heat equation is the "gradient flow of the entropy in Wasserstein space $W_2$" (Jordan-Kinderlehrer-Otto, also Ambrosio-Gigli-Savaré).
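For concreteness, the two statements above can be sketched as follows (standard formulations; the notation is mine, not from the original answers). In $L^2$, the heat equation is the gradient flow of the Dirichlet energy, while in Wasserstein space it arises as the $\tau \to 0$ limit of the JKO minimizing-movement scheme driven by the entropy:

```latex
\[
  E(u) = \tfrac12 \int |\nabla u|^2 , \qquad
  \partial_t u = -\nabla_{L^2} E(u) = \Delta u ;
\]
\[
  \rho_{k+1} \in \operatorname*{arg\,min}_{\rho}
  \left\{ \frac{1}{2\tau}\, W_2^2(\rho, \rho_k) + \int \rho \log \rho \right\},
\]
```

where $W_2$ denotes the quadratic Wasserstein distance, and the piecewise-constant interpolation of the $\rho_k$ converges to the solution of the heat equation as $\tau \to 0$.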