Why doesn't a Taylor series always converge?
It is rather unfortunate that in calc II we teach Taylor series at the same time as we teach Taylor polynomials, all the while not doing a very good job of stressing the distinction between an infinite series and a finite sum. In the process we seem to teach students that Taylor series are a much more powerful tool than they are, and that Taylor polynomials are a much less powerful tool than they are.
The main idea is really the finite Taylor polynomial. The Taylor series is just the limit of these polynomials as the degree tends to infinity. The Taylor polynomial of degree $n$ approximates the function using its value and its first $n$ derivatives at a given point. The remainder formulae tell us about the error in this approximation; in particular, they tell us that higher-degree polynomials provide better local approximations to a function.
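As a rough numerical illustration of that last claim (the helper `sin_taylor` below is an ad hoc sketch of mine, not a library routine), here are the errors of the Taylor polynomials of $\sin$ about $0$, evaluated at a nearby point; the error shrinks as the degree grows:

```python
import math

def sin_taylor(x, n):
    """Evaluate the degree-n Taylor polynomial of sin about 0 at x."""
    total, term, k = 0.0, x, 0
    while 2 * k + 1 <= n:      # odd-degree terms x, -x^3/3!, x^5/5!, ...
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
        k += 1
    return total

x = 0.5
for n in (1, 3, 5, 7):
    print(f"degree {n}: |sin(x) - T_n(x)| = {abs(math.sin(x) - sin_taylor(x, n)):.2e}")
```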
But the issue is that what "local" means really depends on the degree $n$. Looking at the Lagrange remainder, the error in an approximation of degree $n$ is
$$\frac{f^{(n+1)}(\xi_n) (x-x_0)^{n+1}}{(n+1)!}$$
where $\xi_n$ is between $x_0$ and $x$. So the ratio of the error at step $n$ to the error at step $n-1$ is*
$$\frac{f^{(n+1)}(\xi_n) (x-x_0)}{f^{(n)}(\xi_{n-1}) (n+1)}$$
where similarly $\xi_{n-1}$ is between $x_0$ and $x$. So the degree-$n$ error is smaller in magnitude than the degree-$(n-1)$ error exactly where this quantity has absolute value less than $1$. From this form we can see that, for a fixed $n$, we can choose $x$ close enough to $x_0$ to guarantee that. But if
$$\left|\frac{f^{(n+1)}(\xi_n)}{f^{(n)}(\xi_{n-1})}\right| > \frac{n+1}{|x-x_0|}$$
then the approximation of degree $n$ will be worse than the approximation of degree $n-1$ at $x$. If this keeps happening over and over, then there is no hope of the Taylor series converging to the original function anywhere except the point of expansion. In other words, if the derivatives grow way too fast with $n$, then Taylor expansion has no hope of being successful, even when the derivatives needed exist and are continuous.
*Here I am technically assuming that $f^{(n)}(\xi_{n-1}) \neq 0$. This assumption can fail even when $f$ is not a polynomial; think of the linear approximation of $\sin$ at $\pi/2$. But this is a "degenerate" situation in some sense.
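To see the phenomenon numerically, consider $\log(1+x)$ about $0$, whose $n$-th derivative $(-1)^{n-1}(n-1)!/(1+x)^n$ grows factorially with $n$. A minimal sketch (the helper `log1p_taylor` below is just for illustration) evaluates successive Taylor polynomials at $x=2$, outside the interval of convergence, and each degree does worse than the last:

```python
import math

def log1p_taylor(x, n):
    """Degree-n Taylor polynomial of log(1+x) about 0: sum of (-1)^(k+1) x^k / k."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, n + 1))

x = 2.0                        # well outside the interval of convergence (-1, 1]
exact = math.log(1.0 + x)
prev = None
for n in range(1, 9):
    err = abs(exact - log1p_taylor(x, n))
    note = "worse than previous degree" if prev is not None and err > prev else ""
    print(f"degree {n}: error = {err:9.3f}  {note}")
    prev = err
```

Each new term has magnitude $2^n/n$, which swamps the previous partial sum, so the approximations move away from $\log 3$ rather than toward it.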
One intuitive reason is that, when working with functions of a real argument, we do not care about their singularities in the complex plane. These singularities, however, do restrict the domain of convergence.
The simplest example is the function $$f(x)=\frac{1}{1+x^2},$$ which can be expanded into a Taylor series around $x=0$. The radius of convergence of this series equals $1$ because of the poles of $f$ at $x=\pm i$ in the complex plane.
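A quick numerical check (the helper `partial_sum` below is illustrative) makes the radius visible: partial sums of the geometric series $\sum_k (-x^2)^k$ settle down just inside $|x|=1$ and blow up just outside, even though $f$ itself is perfectly smooth on the whole real line:

```python
def partial_sum(x, n):
    """Partial sum of the Taylor series of 1/(1+x^2) about 0: sum_{k=0}^{n} (-x^2)^k."""
    return sum((-x * x) ** k for k in range(n + 1))

for x in (0.9, 1.1):           # just inside vs. just outside the radius 1
    exact = 1.0 / (1.0 + x * x)
    errs = ", ".join(f"{abs(exact - partial_sum(x, n)):.2e}" for n in (10, 20, 40))
    print(f"x = {x}: errors at n = 10, 20, 40: {errs}")
```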
The Taylor expansion is not derived from the mean value theorem. It is a definition, valid for any function that is infinitely differentiable at a point. The various forms for the remainder are derived in different ways. By definition, the remainder is $R(x)=f(x) - T(x)$, where $f$ is the given function and $T$ is its Taylor expansion (about some point). There is no a priori guarantee that the Taylor expansion gives any value remotely related to the value of the function, other than at the point of expansion. The various forms for the remainder may be used to obtain bounds on the error, which in turn can be used to show convergence on some region, but there is no a priori reason to expect well-behaved bounds. You simply have a formula for the remainder, and the remainder may still be large. Of course, the existence of well-known examples where the Taylor expansion really does not approximate the function at all shows that it is hopeless to expect miracles.
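The standard such example is $f(x)=e^{-1/x^2}$, extended by $f(0)=0$: all of its derivatives vanish at $0$, so its Taylor series at $0$ is identically zero and the remainder is the whole function. A minimal sketch, assuming that piecewise definition:

```python
import math

def f(x):
    """Smooth but non-analytic at 0: exp(-1/x^2), extended by f(0) = 0."""
    return 0.0 if x == 0.0 else math.exp(-1.0 / (x * x))

# All derivatives of f vanish at 0, so every Taylor polynomial about 0 is
# identically zero and the remainder R(x) = f(x) - 0 is the entire function.
for x in (0.1, 0.5, 1.0):
    print(f"x = {x}: f(x) = {f(x):.3e}, but its Taylor series at 0 gives 0.0")
```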