Proof of Krylov-Bogoliubov Theorem
There are two pretty simple proofs. Both rely on studying the action $T_* \colon \mathcal{M} \to \mathcal{M}$, where $\mathcal{M}$ is the space of Borel probability measures on $X$ and the action is given by $(T_* \mu)(E) := \mu(T^{-1}(E))$. A measure $\mu$ is $T$-invariant if and only if $T_* \mu = \mu$.
One proof is the one given by Michael Coffey in his answer: start with any measure $\mu$, not necessarily invariant, such as the $\delta$-measure sitting at an arbitrary point, and then consider the sequence of measures $\mu_n = \frac 1n \sum_{k=0}^{n-1} T^k_* \mu$. Because $\mathcal{M}$ is weak* compact, some subsequence $\mu_{n_j}$ converges to a measure $\nu\in \mathcal{M}$, and it's not hard to show that $\nu$ is invariant.
An alternate proof is to observe that $\mathcal{M}$ is a compact convex subset of the locally convex vector space $C(X)^*$ (with the weak* topology), and that $T_*$ acts continuously on $\mathcal{M}$, whence by the Schauder-Tychonoff fixed point theorem it has a fixed point $\nu=T_* \nu$.
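For readers who like to see the averaging argument numerically, here is a minimal sketch, not part of the original answer. It takes $\mu = \delta_x$, so that $\int f \, d\mu_n$ is just the orbit average $\frac1n \sum_{k=0}^{n-1} f(T^k x)$, and checks that $\int f\circ T \, d\mu_n - \int f \, d\mu_n$ is at most $2\sup|f|/n$ in absolute value. The map (an irrational rotation of the circle), the test function, and the names `cesaro_gap`, `rotation`, `observable` are all illustrative assumptions.

```python
import math


def cesaro_gap(T, f, x, n):
    """Return (integral of f∘T) - (integral of f) with respect to
    mu_n = (1/n) * sum_{k<n} T^k_* delta_x, i.e. a difference of orbit averages."""
    orbit = [x]
    for _ in range(n):
        orbit.append(T(orbit[-1]))  # orbit[k] = T^k(x) for k = 0..n
    avg_f = sum(f(y) for y in orbit[:n]) / n         # integral of f d(mu_n)
    avg_fT = sum(f(y) for y in orbit[1:n + 1]) / n   # integral of f∘T d(mu_n)
    return avg_fT - avg_f


# Hypothetical example system: rotation of the circle [0, 1) by an irrational angle.
ALPHA = (math.sqrt(5) - 1) / 2


def rotation(x):
    return (x + ALPHA) % 1.0


def observable(x):
    # A continuous function on the circle with sup norm 1.
    return math.cos(2 * math.pi * x)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        gap = cesaro_gap(rotation, observable, 0.0, n)
        # The sum telescopes, so |gap| should not exceed 2 * sup|f| / n = 2 / n.
        print(n, abs(gap), 2.0 / n)
```

The gap shrinks like $1/n$ because the two averages differ only in their first and last terms; this is the estimate that forces any weak* limit point of $\mu_n$ to be invariant.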
First, fix $x \in X$ and let $\mu_1 := \delta_x$ be the Dirac measure supported at $x$. Then define a sequence of probability measures $\mu_n$ such that for any $f \in C^0 (X)$, $$ \int_X f(y) \, \mathrm{d} \mu_n (y) = \frac{1}{n} \sum_{k=0}^{n-1} \int_X f \circ T^k (y) \, \mathrm{d} \mu_1 (y). $$ Apply the Banach-Alaoglu theorem to deduce that there exists a subsequence $\mu_{n_j}$ which converges in the weak-$\star$ topology. It is then very easy to prove that this limit measure is in fact $T$-invariant, using the formulation that $\mu$ is $T$-invariant if and only if $$\int_X f \circ T \, \mathrm{d} \mu = \int_X f \, \mathrm{d}\mu$$ for all continuous $f$.
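The final step can be sketched as follows. This spells out one standard way to finish and is not taken from the answer itself; it assumes $T$ is continuous and $X$ is compact, writes $\nu$ for the weak-$\star$ limit of $\mu_{n_j}$, and writes $\|f\|_\infty$ for the sup norm. Since $\mu_1 = \delta_x$, the defining sum telescopes: for every continuous $f$, $$ \int_X f \circ T \, \mathrm{d}\mu_n - \int_X f \, \mathrm{d}\mu_n = \frac{1}{n} \sum_{k=0}^{n-1} \bigl( f(T^{k+1} x) - f(T^k x) \bigr) = \frac{1}{n} \bigl( f(T^n x) - f(x) \bigr), $$ so that $$ \left| \int_X f \circ T \, \mathrm{d}\mu_n - \int_X f \, \mathrm{d}\mu_n \right| \le \frac{2 \|f\|_\infty}{n}. $$ Both $f$ and $f \circ T$ are continuous, so passing to the limit along $n_j$ gives $$ \int_X f \circ T \, \mathrm{d}\nu - \int_X f \, \mathrm{d}\nu = \lim_{j \to \infty} \frac{1}{n_j} \bigl( f(T^{n_j} x) - f(x) \bigr) = 0, $$ which is exactly the invariance criterion above.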
In addition to the excellent answers above, I also suggest the nice survey Oxtoby, Ergodic Sets.
Introduction. Ergodic sets were introduced by Kryloff and Bogoliouboff in 1937 in connection with their study of compact dynamical systems [16]. The purpose of this paper is to review some of the work that has since been done on the theory that centers around this notion, and to present a number of supplementary remarks, applications, and simplifications. For simplicity we shall confine attention to systems with a discrete time. Continuous flows present no difficulty, but the development of a corresponding theory for general transformation groups is still in an incomplete stage. An example due to Kolmogoroff (see [5]) shows that such an extension cannot be made without sacrificing either the invariance or the disjointness of ergodic sets.
In §§1 and 2 we give a brief, but self-sufficient, development of the basic theorems of Kryloff and Bogoliouboff. In §3 we collect some auxiliary results for later use. In §4 a simple characterization of transitive points is obtained. In §5 the distinctive properties of some special types of systems and subsystems are discussed, and in §6 these results are used to discover conditions under which the ergodic theorem holds uniformly. In §7 a generalization to noncompact systems is considered, and in §§8 and 9 some known representation theorems are obtained as an application of ergodic sets. In §10 there is given an example of a minimal set that is not strictly ergodic, similar to one constructed by Markoff.