Is Liouville's equation an axiom of classical statistical mechanics?

Why There is a Need for a Further Axiom

To derive Liouville's equation, you indeed need another axiom further to your assumptions. Something like: "there is no nett creation or destruction of any particle of any species throughout the particle system state evolution". The easiest way to understand the need for this axiom is to cite a system for which Liouville's equation cannot hold, even though particles undergo dynamical evolution described by Hamilton's equations throughout their lifetimes: a system of particles undergoing a far-from-equilibrium chemical reaction. In such a system, reactant particle species are consumed by the reaction, and disappear from phase space. Reaction product particles appear in phase space in their place. Moreover, chemical energy is converted to kinetic energy (or contrariwise), so that a product species will "suddenly" appear at a different point in phase space from the one where the correspondingly consumed reactant species particles vanished. Liouville's equation would conceptually be replaced by a coupled system of equations, one for each species $j$, of the form:

$$\frac{\partial\,\rho_j(X,\,t)}{\partial\,t} = \{H,\,\rho_j\} + \sum\limits_k \int_\mathcal{P} M_{j\,k}(X,\,X^\prime)\,\rho_k(X^\prime,\,t)\,\mathrm{d}\Gamma^\prime$$

where the integral is over all phase space $\mathcal{P}$, $\Gamma$ is the measure defined by the volume form and the kernel $M_{j\,k}$ expresses detailed stoichiometric balance between the chemically reacting species as well as other physical principles such as conservation of energy and momentum and the strict increase of entropy with time. Note that I said "nett" creation or destruction: Liouville's equation would work if the reaction were at equilibrium.
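
To see concretely how the coupling term breaks particle number conservation, one can integrate the equation above over phase space. This is only a sketch: $N_j(t)$ is notation introduced here for the number of particles of species $j$, and it is assumed (this is not stated above) that each $\rho_j$ decays fast enough towards the boundary of $\mathcal{P}$ for the integral of the Poisson bracket, which is a pure divergence, to vanish:

$$N_j(t)\stackrel{def}{=}\int_\mathcal{P}\rho_j(X,\,t)\,\mathrm{d}\Gamma\quad\Rightarrow\quad\frac{\mathrm{d}\,N_j}{\mathrm{d}\,t} = \underbrace{\int_\mathcal{P}\{H,\,\rho_j\}\,\mathrm{d}\Gamma}_{=\,0} + \sum\limits_k \int_\mathcal{P}\int_\mathcal{P} M_{j\,k}(X,\,X^\prime)\,\rho_k(X^\prime,\,t)\,\mathrm{d}\Gamma^\prime\,\mathrm{d}\Gamma$$

Under that assumption, the Hamiltonian part of the evolution cannot change $N_j$ at all; only the kernel term can, and $N_j$ stays constant only when the double integral sums to zero, as it would at equilibrium.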

Complete Axioms

The following axioms (1. and 2. are equivalent to yours) will get you Liouville's equation:

  1. Axiom 1: Phase space is a $2\,N$ dimensional $C^2$ manifold $\mathcal{P}$;
  2. Axiom 2: Points in phase space always and only evolve with a flow parameter $t$ through Hamilton's equations defined by a $C^2$ Hamiltonian $H:\mathcal{P}\times \mathbb{R}\to\mathbb{R}$, the latter possibly time varying (hence the domain $\mathcal{P}\times\mathbb{R}$);
  3. Axiom 3: The full states of particles are points in $\mathcal{P}$ evolving according to axiom 2 and there is no nett creation or destruction of any particle of any species throughout the particle system state evolution.

From Complete Axioms to Liouville's Equation

From these axioms, the chain of inference you need is as follows:

  1. Inference 1: From axioms 1. and 2., deduce that any $X\in T_p\,\mathcal{P};\;\forall p\in \mathcal{P}$ expressed in canonical co-ordinates (i.e. ones for which the Hamilton equations hold) that is Lie-dragged by the Hamiltonian flow evolves according to $\dot{X} = A(t)\,X$ where $A(t)\in\mathfrak{sp}(N,\,\mathbb{R})$, thus the symplectic 2-form $\omega(X,\,Y)\stackrel{def}{=} X^T\,\Omega\,Y$ where, for the special case of canonical co-ordinates, $\Omega =\left(\begin{array}{cc}0&-1_N\\1_N&0\end{array}\right)\;\forall p\in\mathcal{P}$ is conserved under the mapping $\mathcal{P}\mapsto \Phi(H,\,t)\,\mathcal{P},\,\forall t\in\mathbb{R}$ induced by the Hamiltonian flow. (Indeed, at any given point $p\in\mathcal{P}$ find $2\,N$ different $C^2$ Hamiltonians such that the tangents to their flows span the $2\,N$ dimensional space $T_p\,\mathcal{P}$ to deduce that the Lie derivative of $\omega$ in any direction is nought, thus $\mathrm{d}\omega=0$ from Cartan's formula relating Lie and Exterior derivatives, but this information is further to our immediate needs). Take heed that inference 1 holds whether or not the Hamiltonian be time-varying. In the time-varying case, the Hamiltonian is not constant along the flow, but the flow still conserves the symplectic form.
  2. Inference 2: From inference 1, we have immediately that the volume form $\Gamma = \omega^N$ ($N^{th}$ exterior power) is conserved under Hamiltonian flows. Thus deduce Liouville's theorem (as opposed to equation). Alternatively, the conservation of the symplectic form shown in Inference 1 implies that the Jacobian matrix of the transformation $\mathcal{P}\mapsto \Phi(H,\,t)\mathcal{P}$ is a symplectic matrix (member of $\mathrm{Sp}(N,\,\mathbb{R})$), which always has a unit determinant. Thus the volume form is conserved.
  3. Inference 3: But the factor by which the transformation $\mathcal{P}\mapsto \Phi(H,\,t)\,\mathcal{P}$ scales the volume form is its Jacobian determinant $J(p,\,\Phi(H,\,t))$, and $J(p,\,\Phi(H,\,0))=J(p,\,\mathrm{id})=1$. Since the volume form is conserved, the Jacobian $J(p,\,\Phi(H,\,t))=1,\forall p\in\mathcal{P},\forall t\in \mathbb{R}$. Thus $\Phi$ is everywhere a local bijection (inverse function theorem). Alternatively, we can make the same deduction straight from Inference 1, which implies that the Jacobian matrix of the transformation $\mathcal{P}\mapsto \Phi(H,\,t)\mathcal{P}$ is a symplectic matrix (member of $\mathrm{Sp}(N,\,\mathbb{R})$), which is never singular and indeed always has a unit determinant.
  4. Inference 4: From Axioms 2 and 1, deduce that the (squared Euclidean) distance function defined in canonical co-ordinates by $d(p_1,\,p_2) = (p_1-p_2)^T\,(p_1-p_2)$, zero iff $p_1=p_2$, between any pair of points $p_1,\,p_2\in\mathcal{P}$ must be a continuous function of the flow parameter $t$ (continuous with respect to the topology with basis of open balls defined by this distance function);
  5. Inference 5: From Inference 3, at any $p\in\mathcal{P}$ and $t\in \mathbb{R}$, there is an open set $\mathcal{U}_p$ small enough such that $\Phi(H,\,t):\mathcal{U}_p\to\Phi(H,\,t)(\mathcal{U}_p)$ is a bijection. The question now arises as to whether $\Phi(H,\,t)$ can map any point outside $\mathcal{U}_p$ into $\Phi(H,\,t)(\mathcal{U}_p)$ (which situation would make $\Phi(H,\,t)$ a local bijection but globally many to one for some $t\in\mathbb{R}$). However, if two or more points are mapped to one point $\tilde{p}\in\Phi(H,\,t)(\mathcal{U}_p)$, from inference 4. deduce that $\exists t$ small enough that the two chosen preimages of $\tilde{p}$ both lie in $\mathcal{U}_p$, thus contradicting local bijectivity. (Informally, from inference 4, multiple points of a function can only arise from mappings along connected "forked" flow lines, so zoom in near enough on the fork point and thus contradict local bijectivity, showing that forks are impossible). Repeating the reasoning for $-t$ lets us deduce that multiple points are impossible and $\Phi(H,\,t):\mathcal{P}\to\mathcal{P}$ is a global bijection (indeed a symplectomorphism in the light of inference 1, but, again, this information is further to our needs);
  6. Inference 6: From inference 5 and axiom 3, deduce that if there is some number $M$ of particles in any subset $\mathcal{V}\subseteq\mathcal{P}$, then there are precisely $M$ particles in $\Phi(H,\,t)(\mathcal{V})$. From inference 2. deduce that $\mathcal{V}$ and $\Phi(H,\,t)(\mathcal{V})$ have the same volumes. Therefore infer that the average particle density in any subset $\mathcal{V}\subseteq\mathcal{P}$ is constant if the particle states and subsets evolve by Hamiltonian flows;
  7. Inference 7: Apply inference 6 to a small open set that is shrunken according to an appropriate limiting process to deduce that the density function $\rho(p,\,t)$ at point $p$ and at time $t$ must be the same as the density at point $\Phi(H,\,-\mathrm{d}t)\,p$ at time $t-\mathrm{d}t$. Putting these words into symbols: $\mathcal{L}_{-X}\rho=\frac{\partial\,\rho}{\partial\,t}$, where $X$ is the vector field tangent to the Hamiltonian flow $\Phi$. This is, of course, $\{H,\,\rho\}=\frac{\partial\,\rho}{\partial\,t}$, or Liouville's equation.
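
Inferences 1 to 3 can be checked numerically on a toy system. The following is a minimal sketch, not part of the argument above: it assumes a single pendulum with $H = p^2/2 - \cos q$ (so $N=1$, where a $2\times 2$ matrix is symplectic exactly when its determinant is one), approximates the flow $\Phi(H,\,t)$ with a classical Runge-Kutta integrator, and estimates the Jacobian determinant by central differences. All function names, the step count and the difference step are illustrative choices.

```python
import math


def hamilton_rhs(q, p):
    """Right-hand side of Hamilton's equations for H = p**2/2 - cos(q)."""
    return p, -math.sin(q)


def rk4_step(q, p, dt):
    """One classical Runge-Kutta step along the Hamiltonian flow."""
    k1q, k1p = hamilton_rhs(q, p)
    k2q, k2p = hamilton_rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = hamilton_rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = hamilton_rhs(q + dt * k3q, p + dt * k3p)
    q_new = q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
    p_new = p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return q_new, p_new


def flow(q, p, t, steps=2000):
    """Approximate Phi(H, t) applied to the phase space point (q, p)."""
    dt = t / steps  # negative t simply runs the flow backwards
    for _ in range(steps):
        q, p = rk4_step(q, p, dt)
    return q, p


def jacobian_determinant(q, p, t, h=1e-5):
    """Determinant of d(flow)/d(q, p), estimated by central differences."""
    q_qp, p_qp = flow(q + h, p, t)
    q_qm, p_qm = flow(q - h, p, t)
    q_pp, p_pp = flow(q, p + h, t)
    q_pm, p_pm = flow(q, p - h, t)
    dq_dq = (q_qp - q_qm) / (2 * h)
    dp_dq = (p_qp - p_qm) / (2 * h)
    dq_dp = (q_pp - q_pm) / (2 * h)
    dp_dp = (p_pp - p_pm) / (2 * h)
    return dq_dq * dp_dp - dq_dp * dp_dq


if __name__ == "__main__":
    for t in (0.5, 2.0, -2.0):
        print(t, jacobian_determinant(0.7, 0.3, t))
```

The printed determinants should agree with one up to the integration and finite-difference error. Runge-Kutta is not itself a symplectic integrator, so the agreement is only as good as the step size allows; it is used here purely as an accurate approximation to the exact flow.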

Circularity of Other Proofs

Ultimately, I don't believe that proofs of Liouville's equation grounded on the divergence theorem are different from the above: I think that they are tacitly introducing Axiom 3 as "obvious" (even though I hope I have shown at the beginning of my answer that it doesn't always hold) and then the continuity equation and incompressible flows are simply an expression of this tacitly assumed axiom. So I don't think that these "proofs" are circular, just somewhat badly written in making use of tacit assumptions.

Summary

User Image sums all this up nicely (I was perhaps too brainfried to make the last step):

For Axiom 3 however, you showed that Axiom 3 $\Rightarrow \frac{d \varrho}{d t} = 0$. The other direction $\frac{d \varrho}{d t} = 0 \Rightarrow$ Axiom 3 is readily discussed in any textbook (trajectories do not start, end or cross etc.). So in fact we have Axiom 3 $\Leftrightarrow \frac{d \varrho}{d t} = 0$ when we are in the context of Axiom 1+2, e.g. classical mechanics. Hence, Liouville's equation is an axiom.

and indeed, in the presence of the other two, my axiom 3 is logically equivalent to Liouville's equation. My version is perhaps more physically transparent, but open to interpretation, and so the assertion of Liouville's equation as an axiom is perhaps more succinct and precise. So the answer to the title question is that Liouville's Equation must indeed be added as an axiom, and, in the presence of Axioms 1 and 2, it has the meaning that particle number of all species is conserved.


The probability for the system to be in the phase cell $d\Gamma(t)$ at time $t$ is $$ P(t)=\rho(q,p,t)d\Gamma(t). $$ This takes into account both the time evolution of the trajectories and the possible explicit time dependence of $\rho$. An infinitesimal time $dt$ later, $$ P(t+dt)=\rho(q+dq,p+dp,t+dt)d\Gamma(t+dt). $$ Because probability is normalised (e.g. if we let the phase volume flow with the time evolution, the probability it contains should not change), these two must be equal. By Liouville's theorem we have $d\Gamma(t)=d\Gamma(t+dt)$, so the $\rho$ expressions must agree, and we have $$ 0=\rho(q+dq,p+dp,t+dt)-\rho(q,p,t)=\frac{\partial\rho}{\partial q}dq+\frac{\partial\rho}{\partial p}dp+\frac{\partial\rho}{\partial t}dt, $$ from which we have (by "dividing" by $dt$) $$ \frac{d\rho}{dt}=\frac{\partial\rho}{\partial q}\dot{q}+\frac{\partial\rho}{\partial p}\dot{p}+\frac{\partial\rho}{\partial t}=0,$$ which is equivalent to $$ \{H,\rho\}=\frac{\partial\rho}{\partial t}, $$ which is Liouville's equation.
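
The result above can be sanity-checked on a system whose flow is known in closed form. The sketch below assumes a harmonic oscillator with $H=(p^2+q^2)/2$ in units where mass and frequency are one, and a hypothetical Gaussian initial density `rho0`; neither choice comes from the argument above. The density at time $t$ is obtained by carrying `rho0` along the flow, and the residual $\frac{\partial\rho}{\partial t}-\{H,\rho\}$ is estimated by central differences, with the convention $\{A,B\}=\frac{\partial A}{\partial q}\frac{\partial B}{\partial p}-\frac{\partial A}{\partial p}\frac{\partial B}{\partial q}$ that matches the signs used here.

```python
import math


def rho0(q, p):
    """Hypothetical initial density: a normalised Gaussian centred at (1, 0)."""
    return math.exp(-((q - 1.0) ** 2 + p ** 2)) / math.pi


def rho(q, p, t):
    """Density at time t, transported by the flow of H = (p**2 + q**2) / 2."""
    # Run the exact flow backwards by t to find where the point (q, p) started.
    q_start = q * math.cos(t) - p * math.sin(t)
    p_start = q * math.sin(t) + p * math.cos(t)
    return rho0(q_start, p_start)


def liouville_residual(q, p, t, h=1e-4):
    """Estimate partial_t(rho) - {H, rho} using central differences."""
    d_t = (rho(q, p, t + h) - rho(q, p, t - h)) / (2 * h)
    d_q = (rho(q + h, p, t) - rho(q - h, p, t)) / (2 * h)
    d_p = (rho(q, p + h, t) - rho(q, p - h, t)) / (2 * h)
    # {H, rho} = dH/dq * drho/dp - dH/dp * drho/dq, with dH/dq = q, dH/dp = p.
    poisson_bracket = q * d_p - p * d_q
    return d_t - poisson_bracket


if __name__ == "__main__":
    print(liouville_residual(0.4, -0.2, 1.3))
```

The residual should be zero up to finite-difference error at any point and time, which is just the statement that $\rho$ is constant along trajectories.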

Now, in equilibrium, we postulate $\rho$ to be explicitly time independent (as a definition of equilibrium), then we have $$ \{H,\rho\}=0, $$ so $\rho$ is a constant of motion.
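
As a standard illustration (added here as a sketch, assuming a time-independent $H$ and the convention $\{A,B\}=\frac{\partial A}{\partial q}\frac{\partial B}{\partial p}-\frac{\partial A}{\partial p}\frac{\partial B}{\partial q}$), any density that depends on the phase space point only through the Hamiltonian, $\rho=f(H)$ with $f$ differentiable, satisfies this condition by the chain rule: $$ \{H,f(H)\}=\frac{\partial H}{\partial q}\,f'(H)\,\frac{\partial H}{\partial p}-\frac{\partial H}{\partial p}\,f'(H)\,\frac{\partial H}{\partial q}=0. $$ The canonical choice $f(H)\propto e^{-\beta H}$, with $\beta$ an inverse temperature parameter, is one such density.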


Edit:

To justify the handwave-y step ($P(t)=P(t+dt)$), consider that for the entirety of phase space $\mathcal{P}$, we must have $$ 1=\int_{\mathcal{P}}\rho(q,p,t)d\Gamma(t). $$ By Liouville's theorem, the phase flow preserves $d\Gamma$, so it is time independent. Therefore, for the integral to be time-independent, the integrand must be so.

This is still a bit handwave-y, as the integral happens over phase space, whereas we take our time derivative to be "along the time evolution", but I think this can be further formalised by letting the phase flow act on the entire integral, e.g. by first taking $$ 1=\int_{\mathcal{P}}\rho(q,p,t)d\Gamma, $$ and then taking $$ 1=\int_{\Phi_t(\mathcal{P})}\rho\ d\Gamma, $$ and comparing the two (where $\Phi_t$ is the phase flow).


Edit2:

Sorry for the mass edits, but I just checked, and my previous intuition seems correct. Some differential geometry shall be employed here.

The Liouville form is $d\Gamma$ (note that $d$ here is just a notation, not an exterior derivative), which in canonical (Darboux) coordinates is given by $d\Gamma=dq^1\wedge...\wedge dq^n\wedge dp_1\wedge...\wedge dp_n$.

The phase density itself defines a (possibly) time-dependent $2n$-form as $\rho(q,p,t)d\Gamma$. Let $\Phi_t$ be the phase flow and consider $$ \frac{d}{dt}\bigg|_{t=0}\int_{\Phi_t(\mathcal{P})}\rho\, d\Gamma=\frac{d}{dt}\bigg|_{t=0}\int_{\mathcal{P}}(\Phi_t)^*(\rho\, d\Gamma)=\int_\mathcal{P}\frac{d}{dt}\bigg|_{t=0}\left(\Phi^*_t\rho\;\Phi^*_td\Gamma\right)=\int_\mathcal{P}\left[\left(\mathcal{L}_X\rho+\frac{\partial\rho}{\partial t}\right)d\Gamma+\rho\,\mathcal{L}_Xd\Gamma\right]. $$ Here $X=d\Phi/dt|_{t=0}$ is the Hamiltonian vector field, all time derivatives are taken at $t=0$, the partial time derivative appears because $\rho$ has explicit time dependence too, and the last term is zero by Liouville's theorem.
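
For completeness, here is a sketch of why the term $\rho\,\mathcal{L}_Xd\Gamma$ drops out, written in the same Darboux coordinates and assuming $H$ is at least $C^2$ so that mixed partial derivatives commute. The Lie derivative of a volume form along $X$ is the divergence of $X$ times that form, and the Hamiltonian vector field $X=\sum_i\left(\frac{\partial H}{\partial p_i}\frac{\partial}{\partial q^i}-\frac{\partial H}{\partial q^i}\frac{\partial}{\partial p_i}\right)$ is divergence free: $$ \mathcal{L}_Xd\Gamma=(\mathrm{div}\,X)\,d\Gamma=\sum_{i=1}^n\left(\frac{\partial^2 H}{\partial q^i\,\partial p_i}-\frac{\partial^2 H}{\partial p_i\,\partial q^i}\right)d\Gamma=0. $$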

Because integrals are diffeomorphism-invariant, this derivative must be zero. Moreover, this must be true for any probability density $\rho$, hence the integrand itself must vanish, so we have $$ \mathcal{L}_X\rho+\frac{\partial\rho}{\partial t}=0, $$ and $X$ is the same as your $\vec{v}$, so this is actually Liouville's equation.