Is log the only choice for measuring information?

We want to classify all continuous(!) functions $I\colon(0,1]\to\Bbb R$ with $I(xy)=I(x)+I(y)$. If $I$ is such a function, we can define the (also continuous) function $f\colon[0,\infty)\to \Bbb R$ given by $f(x)=I(e^{-x})$ (using that $x\ge 0$ implies $e^{-x}\in(0,1]$). Then $f$ satisfies the functional equation $$f(x+y)=I(e^{-(x+y)})=I(e^{-x}e^{-y})=I(e^{-x})+I(e^{-y})=f(x)+f(y).$$ Let $$ S:=\{\,a\in[0,\infty)\mid \forall x\in[0,\infty)\colon f(ax)=af(x)\,\}.$$ Then trivially $1\in S$. Also, $f(0+0)=f(0)+f(0)$ implies $f(0)=0$ and so $0\in S$. By the functional equation, $S$ is closed under addition: if $a,a'\in S$, then for all $x\ge 0$ we have $$f((a+a')x)=f(ax+a'x)=f(ax)+f(a'x)=af(x)+a'f(x)=(a+a')f(x)$$ and so also $a+a'\in S$.

Using this, we show by induction that $\Bbb N\subseteq S$: we have $1\in S$; and if $n\in S$, then also $n+1\in S$ (because $1\in S$ and $S$ is closed under addition).

Next note that if $a,b\in S$ with $b>0$ then for all $x\ge0$ we have $f(a\frac xb)=af(\frac xb)$ and $f(x)=f(b\frac xb)=bf(\frac xb)$, i.e., $f(\frac ab x)=\frac abf(x)$ and so $\frac ab\in S$. As $\Bbb N\subseteq S$, this implies that $S$ contains all positive rationals, $\Bbb Q_{>0}\subseteq S$.

In particular, if we let $c:=f(1)$, then $f(x)=cx$ for all $x\in \Bbb Q_{>0}$. As we wanted continuous functions, it follows that $f(x)=cx$ for all $x\in[0,\infty)$. Then $$ I(x)=f(-\ln x)=-c\ln x.$$
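As a quick numerical sanity check (a sketch in Python; the constant $c=1$ is chosen arbitrarily here), the derived solution $I(x)=-c\ln x$ does turn products into sums, and the corresponding $f(x)=I(e^{-x})$ is indeed linear:

```python
import math

C = 1.0  # the free constant c = f(1); any value works

def I(x):
    # The solution derived above: I(x) = -c*ln(x) on (0, 1]
    return -C * math.log(x)

def f(x):
    # f(x) = I(e^{-x}), defined for x >= 0
    return I(math.exp(-x))

# I turns products into sums: I(xy) = I(x) + I(y)
for x, y in [(0.5, 0.25), (0.1, 0.9), (0.3, 0.7)]:
    assert math.isclose(I(x * y), I(x) + I(y))

# f is additive and in fact linear: f(x) = c*x
for x in [0.0, 1.0, 2.5, 7.0]:
    assert math.isclose(f(x), C * x)
```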

Remark: The requirement that $I$ (and hence $f$) be continuous is of course reasonable in the given context. But it turns out that much milder restrictions on $f$ (e.g., measurability, or boundedness on some interval) suffice to enforce the result found above. It is only without any such restriction that the Axiom of Choice supplies us with highly non-continuous additional solutions to the functional equation. The original remark that logs fit what we expect and are easy to work with is quite an understatement if one even thinks of considering these non-continuous solutions.


I just wanted to point something out; honestly, I think the other answers are far better, given that this is a mathematics site. I'm only adding another argument for why the logarithm makes sense as the only choice.

You have to ask yourself: what even *is* information?

Information is the ability to distinguish possibilities.¹

¹ Compare with energy in physics: the ability to do work or produce heat.

Okay, let's start reasoning.

Every bit (= binary digit) of information can, by definition, distinguish $2$ possibilities, because it can take $2$ different values. Similarly, $n$ bits of information can distinguish $2^n$ possibilities.

Therefore: the amount of information required to distinguish $2^n$ possibilities is $n$ bits.
And the exact same reasoning works regardless of whether you're talking about base $2$, $3$, or $e$.
So clearly you have to take a logarithm if the number of possibilities is an integer power of the base.
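This counting argument can be sketched in a few lines of Python (the function name `info_content` is mine, just for illustration): distinguishing $b^n$ possibilities takes exactly $n$ base-$b$ digits, i.e., $\log_b$ of the number of possibilities.

```python
import math

def info_content(possibilities, base=2):
    # Information (in base-`base` digits) needed to distinguish
    # `possibilities` equally likely outcomes: log_base(possibilities).
    return math.log(possibilities) / math.log(base)

assert math.isclose(info_content(2 ** 10), 10)                  # 1024 outcomes -> 10 bits
assert math.isclose(info_content(3 ** 4, base=3), 4)            # same reasoning in base 3
assert math.isclose(info_content(math.e ** 2, base=math.e), 2)  # and in base e ("nats")
```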

Now, what if the number of possibilities is not a power of $b = 2$ (or whatever your base is)?
In this case you're looking for a function that coincides with the logarithm at the integer powers.

At this point I would already be convinced to use the logarithm itself (anything else would seem bizarre), but this is where a mathematician would invoke the reasoning given in the other answers (continuity, or additivity for independent events, and so on) to show that no other function could satisfy reasonable criteria for an information content.