The concept of ensemble

I believe it is easier to get the general idea if we first contemplate the problem it solves.

  • Let's begin by saying we're interested in tracking the trajectories of a multitude of particles and seeing whether we can tell something about them. We will need time averages, since trajectories are functions of time;
  • We notice our particle number is so large that we're pretty much unable to deal with anything that involves trajectories, initial conditions, etc.;
  • We now focus on the fact that our system follows the laws of mechanics and there are some interesting theorems that might help us a lot, like Poincaré's recurrence theorem and the conservation of energy;
  • Keeping recurrence in mind we postulate, on physical grounds, that the multitude of orbits we couldn't track before are now so incredibly complicated that their complexity happens to help: we say they will recur and that each and every region accessible to them will be filled;
  • Since each and every region is, according to our assumption, accessible, we no longer need time averages: we can use space averages instead. This calls for a space in which each point is a possible configuration of our system, that is, a probability space built over our initial phase space.
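To see the "time average equals space average" idea at work, here is a minimal sketch on a toy system, not on a real many-particle gas. It assumes a single point moving on a circle by a fixed irrational step, which is a standard example of an ergodic map; the function names, the observable and the step `alpha` are all hypothetical choices made for illustration.

```python
import math


def time_average(f, x0, alpha, n_steps):
    """Average f along the orbit x -> (x + alpha) mod 1 starting at x0."""
    x = x0
    total = 0.0
    for _ in range(n_steps):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / n_steps


def space_average(f, n_cells):
    """Midpoint-rule average of f over the whole circle [0, 1)."""
    return sum(f((k + 0.5) / n_cells) for k in range(n_cells)) / n_cells


def observable(x):
    # Illustrative observable; its exact average over the circle is 1/2.
    return math.cos(2.0 * math.pi * x) ** 2


# Assumption: an irrational step, so that the orbit fills the circle densely.
alpha = (math.sqrt(5.0) - 1.0) / 2.0

t_avg = time_average(observable, x0=0.1, alpha=alpha, n_steps=100_000)
s_avg = space_average(observable, n_cells=10_000)
print(t_avg, s_avg)
```

Following one orbit for a long time and averaging over the whole circle at a single instant should give nearly the same number. With a rational step the orbit would be periodic, would visit only finitely many points, and the two averages could disagree.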

I will now define such a probability space in a manner that looks proper to me. We start by taking $M$ to be the even-dimensional phase-space manifold of our system, which mathematically belongs to a symplectic dynamical system $D=(M,\omega,T_t)$, where $\omega$ is the symplectic 2-form and $T_t$ is our dynamics, that is, the law which dictates the behaviour of each of our particles as a function of time. Since this law was rendered useless I'm not at all concerned with it, but we must remember that we tacitly assumed this law was weakly mixing or at least ergodic when we replaced time averages with space averages... This does not hold for every Hamiltonian system, so we take it as a hypothesis: almost every trajectory densely fills its energy surface.

Now, let us take our symplectic dynamical system $D$ and imagine all possible configurations it might access, as described before. We do this by creating a power set $C(M)$ of all possible phase space states we can find, and I claim that

  1. If one configuration is present in this set, then all other configurations complementary to it are, too (of course, because we assume all configurations are possible);
  2. The countable union of configurations is still a configuration, since they are all allowed (I implicitly used the fact that we are considering an infinite number of particles here).

As the last step, since I'm interested in integration, I do some analysis and notice that providing a Lebesgue measure $\mu$ to this space makes sense. We have thus created the ensemble $(M,C(M),\mu)$, which was built upon the notion of a measure space, $M$ being the topological space, $C(M)$ a $\sigma$-algebra and $\mu$ a (finite) Lebesgue measure over it, which can be turned into a probability measure.
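As a sketch of how the finite measure becomes a probability measure, assume $0<\mu(M)<\infty$ and that $\mu$ is invariant under $T_t$. The symbols $P$, $f$ and $\langle f\rangle$ do not appear above and are introduced here only for illustration:

$$P(A)=\frac{\mu(A)}{\mu(M)},\quad A\in C(M),\qquad \langle f\rangle=\int_M f\,dP=\frac{1}{\mu(M)}\int_M f\,d\mu .$$

Under the ergodicity assumption made earlier, the replacement of time averages by space averages then reads, for an integrable observable $f$ and almost every initial point $x\in M$,

$$\lim_{T\to\infty}\frac{1}{T}\int_0^T f(T_t x)\,dt=\langle f\rangle .$$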

I emphasize this is not rigorous. There are flaws, and I took shortcuts that a mathematician would call "cheating", but I'm not a mathematician. Thinking in these terms has helped me a lot to understand the foundations of Statistical Mechanics. Also, you won't find this clearly laid out anywhere else, because physicists usually don't care about it and mathematicians usually don't use it: they prefer a formalism that applies the central limit theorem instead (where everything is indeed much clearer). For a glimpse, check Khintchin's book.