How to see the Phase Space of a Physical System as the Cotangent Bundle
Let's start by answering the first question.
Let $M$ be any manifold. Consider a physical system consisting of a point-particle moving on $M$. What are the configurations of this physical system? The points of $M$. Hence $M$ is the configuration space.
Typically one takes $M$ to be riemannian and we may add a potential function on $M$ in order to define the dynamics. (More complicated dynamics are certainly possible -- this is just the simplest example.)
As an example, let's consider a point particle of mass $m$ moving in $\mathbb{R}^3$ under the influence of a central potential $$V= k/r,$$ where $r$ is the distance from the origin. The configuration space is $M = \mathbb{R}^3\setminus\lbrace 0\rbrace$.
Classical trajectories are curves $x(t)$ in $M$ which satisfy Newton's equation $$m \frac{d^2 x}{dt^2} = -\nabla V = \frac{k\,x}{|x|^3}.$$ To write this equation as a first-order equation we introduce the velocity $v(t) = \frac{dx}{dt}$. Geometrically $v(t)$ is a tangent vector to $M$ at $x(t)$ (a point of the tangent bundle $TM$) and hence the classical trajectory $(x(t),v(t))$ defines a curve in $TM$ satisfying a first-order ODE: $$\frac{d}{dt}(x(t),v(t)) = \left(v(t), \frac{k\,x(t)}{m|x(t)|^3}\right).$$ This equation can be derived from a variational problem associated to a lagrangian function $L: TM \to \mathbb{R}$ given by $$L(x,v) = \frac12 m v^2 - \frac{k}{|x|}.$$ The fibre derivative of the lagrangian function defines a bundle morphism $TM \to T^*M$: $$(x,v) \mapsto (x,p)$$ where $$p(x,v) = \frac{\partial L}{\partial v}.$$
In this example, $p = mv$. The Legendre transform of the lagrangian function $L$ gives a hamiltonian function $H$ on $T^*M$, which in this example is the total energy of the system: $$H(x,p) = \frac{1}{2m}p^2 + \frac{k}{|x|}.$$
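To spell out the Legendre transform in this example (a short worked computation using only the $L$ and $p = mv$ above): invert the fibre derivative to get $v = p/m$ and substitute into $p\cdot v - L$:
$$H(x,p) = \Big(p\cdot v - L(x,v)\Big)\Big|_{v = p/m} = \frac{p^2}{m} - \left(\frac{p^2}{2m} - \frac{k}{|x|}\right) = \frac{1}{2m}p^2 + \frac{k}{|x|}.$$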
The equations of motion can be recovered as the flow along the hamiltonian vector field associated to $H$ via the standard Poisson brackets in $T^*M$:
$$ \frac{dx}{dt} = \lbrace x,H \rbrace \qquad\mathrm{and}\qquad \frac{dp}{dt} = \lbrace p,H \rbrace.$$
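With the standard bracket $\lbrace x^i, p_j \rbrace = \delta^i_j$ these read $\frac{dx}{dt} = \frac{\partial H}{\partial p} = p/m$ and $\frac{dp}{dt} = -\frac{\partial H}{\partial x} = k\,x/|x|^3$. As a rough numerical illustration (a minimal sketch, not part of the original argument), here is a leapfrog integrator for this flow in plain Python. The function names, the values $m = k = 1$, the step size and the initial state are all arbitrary choices made for the example.

```python
import math

def hamiltonian(x, p, m=1.0, k=1.0):
    """Total energy H(x, p) = |p|^2 / (2m) + k / |x|."""
    r = math.sqrt(sum(xi * xi for xi in x))
    return sum(pi * pi for pi in p) / (2.0 * m) + k / r

def force(x, k=1.0):
    """dp/dt = {p, H} = -dH/dx = k x / |x|^3."""
    r = math.sqrt(sum(xi * xi for xi in x))
    return [k * xi / r**3 for xi in x]

def leapfrog_step(x, p, dt, m=1.0, k=1.0):
    """One kick-drift-kick step approximating the hamiltonian flow."""
    f = force(x, k)
    p_half = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]    # half kick
    x_new = [xi + dt * ph / m for xi, ph in zip(x, p_half)]  # drift: dx/dt = p/m
    f_new = force(x_new, k)
    p_new = [ph + 0.5 * dt * fi for ph, fi in zip(p_half, f_new)]  # half kick
    return x_new, p_new

def flow(x, p, dt, steps, m=1.0, k=1.0):
    """Follow the trajectory through the phase-space point (x, p)."""
    for _ in range(steps):
        x, p = leapfrog_step(x, p, dt, m, k)
    return x, p

# Example initial state in T^*M (illustrative values only).
x0 = [1.0, 0.0, 0.0]
p0 = [0.0, 0.5, 0.0]
x1, p1 = flow(x0, p0, dt=1e-3, steps=2000)

energy_drift = abs(hamiltonian(x1, p1) - hamiltonian(x0, p0))
# z-component of angular momentum x cross p, conserved for a central potential
lz0 = x0[0] * p0[1] - x0[1] * p0[0]
lz1 = x1[0] * p1[1] - x1[1] * p1[0]
```

The point of the sketch is that a single point $(x_0, p_0)$ of $T^*M$ determines the whole trajectory, and that $H$ stays (approximately) constant along it.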
Since classical trajectories are integral curves of a vector field, there is a unique one through any given point in $T^*M$; hence $T^*M$ is a phase space for the system, that is, a space of states of the physical system. Of course $TM$ is also a space of states, but historically one calls $T^*M$ the phase space of the system with configuration space $M$. (I don't know the history well enough to know why. There are brackets in $TM$ as well and one could equally well work there.)
Not every space of states is a cotangent bundle, of course. One can obtain examples by hamiltonian reduction from cotangent bundles by symmetries which are induced from diffeomorphisms of the configuration space, for instance. Or you could consider systems whose physical trajectories satisfy an ODE of order higher than 2, in which case the cotangent bundle is not the space of states, since you need to know more than just the position and the velocity at a point in order to determine the physical trajectory.
It's late here, so I'll forego answering the bonus question for now.
This is just to add to what the previous posters have said so much better. It is a wonderful "fact" that whatever manifold you choose, there will always be some mechanical system that can be encoded as geodesic flow on this space (relative to some metric). Here are some examples:
$SO(3)$ : rigid body dynamics. The elements of $SO(3)$ are rotation matrices, determining the orientation of the rigid body.
$\mathrm{Diff}_{vol}(M)$, the group of volume-preserving diffeomorphisms of a manifold $M$: incompressible fluid dynamics. A diffeo $\varphi: M \to M$ tells you the following: a fluid particle initially at $X \in M$ ends up at $x = \varphi(X)$.
$\mathrm{Diff}(S^1)$, ordinary diffeos on the circle: geodesic flow (for the right-invariant $L^2$ metric) is nothing but the inviscid Burgers' equation.
$\mathrm{Diff}(S^1) \times \mathbb{R}$, the Bott-Virasoro group (where the multiplication is defined using the Bott-Virasoro cocycle). Geodesic flow here gives you the Korteweg-de Vries equation, which is Burgers' equation with an additional third-order dispersive term. That extra term stems directly from the BV cocycle.
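For orientation, here are the last two equations written out for a velocity field $u(x,t)$ on the circle. This is only a sketch: the numerical coefficients depend on conventions and have been rescaled away, and $c$ denotes the (constant) central parameter.
$$u_t + u\,u_x = 0 \quad \text{(inviscid Burgers)}, \qquad u_t + u\,u_x + c\,u_{xxx} = 0 \quad \text{(KdV)}.$$
The only difference between the two is the term $c\,u_{xxx}$, which is the contribution of the cocycle.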
Anyway, the point is that you can have a good time just writing down a manifold with a Riemannian metric and writing down the geodesic equations. More often than not, the resulting system will be out there somewhere in the guise of some famous differential equation, and in this way, you can often relate the geometry of the configuration space to the properties of the equation.
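To make the $SO(3)$ example a bit more concrete: for a left-invariant metric given by an inertia tensor, the geodesic equations reduce in the body frame to Euler's equations $\dot\Pi = \Pi \times \Omega$ with $\Omega = I^{-1}\Pi$. Below is a minimal sketch that integrates them with a classical RK4 step and tracks the two conserved quantities (kinetic energy and $|\Pi|^2$). The principal moments of inertia, the initial angular momentum and all function names are made up for the illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def euler_rhs(Pi, inertia):
    """Right-hand side of Euler's equations: dPi/dt = Pi x Omega, Omega = I^{-1} Pi."""
    omega = [Pi[i] / inertia[i] for i in range(3)]
    return cross(Pi, omega)

def rk4_step(Pi, dt, inertia):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return [b + h * s for b, s in zip(base, slope)]
    k1 = euler_rhs(Pi, inertia)
    k2 = euler_rhs(shifted(Pi, k1, dt / 2), inertia)
    k3 = euler_rhs(shifted(Pi, k2, dt / 2), inertia)
    k4 = euler_rhs(shifted(Pi, k3, dt), inertia)
    return [Pi[i] + dt / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
            for i in range(3)]

def energy(Pi, inertia):
    """Kinetic energy (1/2) Pi . I^{-1} Pi, i.e. the metric evaluated on the velocity."""
    return 0.5 * sum(Pi[i] ** 2 / inertia[i] for i in range(3))

def momentum_sq(Pi):
    """|Pi|^2, conserved along every solution."""
    return sum(c * c for c in Pi)

inertia = [1.0, 2.0, 3.0]   # hypothetical principal moments of inertia
Pi0 = [1.0, 0.5, -0.3]      # hypothetical initial body angular momentum
Pi = Pi0
for _ in range(1000):
    Pi = rk4_step(Pi, 1e-3, inertia)
```

Conservation of the energy reflects the fact that geodesics have constant speed; conservation of $|\Pi|^2$ is a first glimpse of how the geometry of the group shows up in the equation.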
Well, I'll not say anything deeply new here - just a (hopefully correct) summary.
Suppose you are given your favorite manifold, say $Q.$ Then its cotangent bundle $M=T^* Q$ comes equipped with some canonical structure. (Keep in mind that the cotangent bundle $T^* Q$ and the tangent bundle $TQ$ are isomorphic as vector bundles over $Q$, so in particular their total spaces are diffeomorphic, but not canonically so; for some miraculous reason the cotangent bundle has "more" structure.) Denote by $\pi:T^* Q\rightarrow Q$ the projection, associating to a covector its basepoint. Differentiating this yields the tangent map of $\pi$, $T\pi: T(T^* Q)\rightarrow TQ$; write $d\pi(\alpha)$ for its restriction to the tangent space at $\alpha$. With its help one can define a one-form $\theta$ on $T^* Q$ (usually called the "canonical" or Liouville one-form). It is defined via
$$\theta_\alpha:T_\alpha (T^* Q)\rightarrow \mathbb{R},\qquad v\mapsto \alpha(d\pi(\alpha).v)$$
for any point $\alpha\in T^* Q$. To explain why the definition makes sense: $v$ is an arbitrary element of $T_\alpha (T^* Q)$, and the differential of $\pi$ at $\alpha$, $d\pi(\alpha)$, is a linear map from $T_\alpha (T^* Q)$ to $T_{\pi(\alpha)} Q$. Furthermore $\alpha$ can be interpreted as a linear form on the tangent space of its base point; consequently it makes sense to evaluate it on $d\pi(\alpha).v$.
Now the symplectic form $\omega\in \Omega^2(T^*Q)$ is defined as the exterior differential of $\theta$ (some authors prefer to smuggle a minus sign in). Notice $\omega$ is defined purely intrinsically (no choice of coordinates for instance, even though one often sees expressions like $\theta=p_idq^i$).
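To connect this with the coordinate expression (a sketch; the coordinates and the component names $a^i$, $b_i$ are introduced only for this check): take local coordinates $q^i$ on $Q$ and the induced coordinates $(q^i, p_i)$ on $T^*Q$, so that the point with coordinates $(q,p)$ is the covector $\alpha = p_i\,dq^i$ at $q$ and $\pi(q,p) = q$. For a tangent vector $v = a^i\,\frac{\partial}{\partial q^i} + b_i\,\frac{\partial}{\partial p_i}$ at $\alpha$ one finds
$$d\pi(\alpha).v = a^i\,\frac{\partial}{\partial q^i}, \qquad \theta_\alpha(v) = \alpha\Big(a^i\,\frac{\partial}{\partial q^i}\Big) = p_i\,a^i,$$
so that
$$\theta = p_i\,dq^i \qquad\text{and}\qquad \omega = d\theta = dp_i\wedge dq^i.$$
With the opposite sign convention one gets $dq^i\wedge dp_i$ instead.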
That's the reason why you can associate a symplectic manifold (aka phase space) to any manifold (aka configuration space).
But to elaborate on what José Figueroa-O'Farrill already said: there are symplectic manifolds which are not of the form $T^* Q$ for any $Q$. Probably the easiest examples are closed symplectic manifolds, i.e. compact ones without boundary (they occur for instance as surfaces with a fixed volume form or as nonsingular complex projective varieties together with the Fubini-Study form). On $T^*Q$ the symplectic form is exact by construction, but you can easily show that the symplectic form of a closed symplectic manifold $M$ cannot be exact (that is, of the form $d\theta$ for some one-form $\theta$) unless $M$ is zero-dimensional. Indeed, write $\dim M = 2n$; if $\omega = d\theta$, then $\omega^{\wedge n} = d(\theta\wedge\omega^{\wedge (n-1)})$ because $d\omega = 0$, so by Stokes' theorem the integral $\int_M \omega^{\wedge n}$ would have to be zero. But $\omega^{\wedge n}$ is a volume form, and it is difficult to find manifolds with zero total volume!
In order to answer the second question: the Lie algebra structure on $\mathfrak{g}=\mathrm{Lie}(G)$ induces a Poisson structure on its dual $\mathfrak{g}^*,$ as for instance explained in http://en.wikipedia.org/wiki/Poisson_manifold#Example.
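Concretely, for functions $f,g$ on $\mathfrak{g}^*$ the bracket is $\lbrace f,g\rbrace(\mu) = \langle \mu, [df_\mu, dg_\mu]\rangle$, where $df_\mu, dg_\mu$ are regarded as elements of $\mathfrak{g}$ (some authors use the opposite overall sign). As a small sanity check, here is a sketch for $\mathfrak{g} = \mathfrak{so}(3)\cong\mathbb{R}^3$ with the cross product as Lie bracket, so that $\lbrace f,g\rbrace(\mu) = \mu\cdot(\nabla f\times\nabla g)$. The helper names and the sample point are invented for the illustration, and gradients are supplied by hand rather than computed.

```python
def cross(a, b):
    """Lie bracket on so(3) ~ R^3: the cross product."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    """Pairing between so(3)^* ~ R^3 and so(3) ~ R^3."""
    return sum(x * y for x, y in zip(a, b))

def lie_poisson(grad_f, grad_g, mu):
    """{f, g}(mu) = <mu, [df, dg]>, with gradients passed as functions of mu."""
    return dot(mu, cross(grad_f(mu), grad_g(mu)))

# Gradients of the coordinate functions mu_1, mu_2, mu_3 (constant unit vectors).
coord = [lambda mu, i=i: [1.0 if j == i else 0.0 for j in range(3)]
         for i in range(3)]

def grad_casimir(mu):
    """Gradient of C(mu) = |mu|^2."""
    return [2.0 * c for c in mu]

mu = [0.3, -1.2, 2.0]  # arbitrary sample point in so(3)^*

b12 = lie_poisson(coord[0], coord[1], mu)    # {mu_1, mu_2}
b21 = lie_poisson(coord[1], coord[0], mu)    # {mu_2, mu_1}
c1 = lie_poisson(grad_casimir, coord[0], mu)  # {C, mu_1}
```

In this convention $\lbrace \mu_1,\mu_2\rbrace = \mu_3$, mirroring the commutation relations of $\mathfrak{so}(3)$, and $|\mu|^2$ Poisson-commutes with every function, which is why the spheres $|\mu| = \text{const}$ are preserved by any hamiltonian flow on $\mathfrak{so}(3)^*$.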