Motivation and use for category theory?
I myself find pure category theory for its own sake rather difficult to swallow, and prefer to think about it through actual examples of its use. So, let me give a few historical examples of how the abstraction of category theory led to significant mathematical advances.
The first example is the theory of etale cohomology. Etale cohomology is a variant of the standard sheaf cohomology one encounters in algebraic geometry courses.* The starting point is a categorical observation: the sheaf axioms are, fundamentally, functorial. A sheaf on a topological space $X$ is a contravariant functor on the category of open sets of $X$ (with the inclusions as morphisms) that satisfies a certain exactness property, namely that compatible sections glue uniquely along open covers. Interpreted in this way, it becomes possible to talk about sheaves on a general category with a suitable notion of covering (i.e., a Grothendieck topology); such a category is called a site. Different sites give different cohomology theories: the Zariski site gives ordinary sheaf cohomology, while the etale site gives etale cohomology. (Incidentally, as another example, in the theory of the etale fundamental group, Grothendieck developed an abstract approach to Galois theory that not only clarifies the analogy between Galois theory and the classification of covering spaces, but allows one to construct purely categorically an algebraic $\pi_1$.)
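To make the "exactness property" concrete, here is one common way to state it (the notation is my own choice for illustration): a presheaf $F$ on $X$ is a sheaf when, for every open cover $U = \bigcup_i U_i$, the following diagram is an equalizer, i.e. sections over $U$ are exactly the families of sections over the $U_i$ that agree on overlaps.

$$
F(U) \longrightarrow \prod_i F(U_i) \rightrightarrows \prod_{i,j} F(U_i \cap U_j)
$$

On a site the same formula is used with a covering family $\{U_i \to U\}$ in place of an open cover and the fiber product $U_i \times_U U_j$ in place of $U_i \cap U_j$. For the etale site, the maps $U_i \to U$ are etale morphisms rather than inclusions of open subsets.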
In homotopy theory, Quillen's language of model categories unified the ideas behind the homotopy theory of simplicial sets and the homotopy theory of topological spaces. To do homotopy theory in this language, one simply needs a category with suitable structure on it (maps designated as cofibrations, fibrations, and weak equivalences; these are supposed to abstract the notions of cofibration, Serre fibration, and weak homotopy equivalence, and to satisfy lifting properties), and from this alone one can construct the homotopy category. Doing so allowed Quillen to efficiently find new examples of model categories that one might not immediately associate with "homotopy theory," such as the model category of simplicial commutative rings; with this, and an abstract definition of homology, he was able to construct the so-called cotangent complex and thus the André-Quillen cohomology of a ring (which had been conjectured by Grothendieck).
Simplicial sets themselves can be viewed purely combinatorially: they are a sequence of sets $X_n$ with suitable face and degeneracy maps, and this is all one needs. But for a human, this mass of notation is somewhat formidable and unintuitive; it is much cleaner to use the language of categories and say that they are (contravariant) functors from the category of nonempty finite totally ordered sets (with order-preserving maps) to the category of sets. This allows one to easily construct things like the standard $n$-simplex $\Delta[n]$ and see its universal property (which is just a consequence of general categorical nonsense, namely Yoneda's lemma). Another benefit of thinking categorically, although I know very little about this, is that there is a general theory (apparently developed by Cisinski) for constructing model structures on presheaf categories.
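Spelled out, as a sketch in notation I am choosing here: write $\Delta$ for the category of the ordered sets $[n] = \{0 < 1 < \dots < n\}$ with order-preserving maps, so that a simplicial set is a functor $X\colon \Delta^{\mathrm{op}} \to \mathbf{Set}$ with $X_n = X([n])$. The standard simplex is then the representable functor, and Yoneda's lemma gives its universal property:

$$
\Delta[n] = \mathrm{Hom}_{\Delta}(-, [n]), \qquad \mathrm{Hom}_{\mathbf{sSet}}(\Delta[n], X) \cong X_n \quad \text{naturally in } X \text{ and } [n].
$$

That is, maps $\Delta[n] \to X$ are the same thing as $n$-simplices of $X$; the bijection sends a map $f$ to $f_{[n]}(\mathrm{id}_{[n]}) \in X_n$.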
In mathematics, it frequently happens that an object will parametrize a family of things in some way. For instance, the Hilbert scheme parametrizes closed subschemes of a projective scheme, while projective space itself parametrizes line bundles together with a set of generators; there are many more examples. In each case, it is a little tricky to state exactly what "parametrizes" really means: the elegant approach is to say that some given functor is representable. In other words, it is to say that some functor $F$ can be realized as maps into some object $X$, which is the "universal" parametrizing object. It is often of interest to give specific criteria for a general functor to be representable (and herein is the essence of the categorical approach; proving representability individually for one concrete functor is a task that could, a priori, be formulated without appeal to category theory). In algebraic topology, a rather spectacular result (the Brown representability theorem) states that anything that looks kind of like cohomology (in particular, any extraordinary cohomology theory) is representable on the homotopy category, at least if you stick to CW complexes. This is really a sweeping result because it applies to a very large class of functors.**
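In symbols (a sketch; the letters $T$ and $\xi$ are labels of my own): a contravariant functor $F$ is represented by $X$ when there is a bijection

$$
F(T) \cong \mathrm{Hom}(T, X) \quad \text{naturally in } T.
$$

By Yoneda's lemma, giving such an isomorphism is the same as giving a "universal element" $\xi \in F(X)$ (the one corresponding to $\mathrm{id}_X$) such that every element of $F(T)$ equals $f^*\xi$ for a unique $f\colon T \to X$. For projective space, taken over $\mathbb{Z}$ for definiteness, this reads

$$
\mathrm{Hom}(T, \mathbb{P}^n_{\mathbb{Z}}) \cong \{ (\mathcal{L}; s_0, \dots, s_n) : \mathcal{L} \text{ a line bundle on } T,\ s_i \in \Gamma(T, \mathcal{L}) \text{ generating } \mathcal{L} \} / \cong,
$$

with universal element $(\mathcal{O}(1); x_0, \dots, x_n)$.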
(In algebraic geometry, I am not aware of any such strong sufficient conditions. On the other hand, there are fairly stringent necessary conditions that any representable functor on the category of schemes must satisfy: such functors must be sheaves in suitable Grothendieck topologies (cf. the discussion of etale cohomology above). In practice this is a type of descent condition.)
*I think one can reasonably argue that even the introduction of sheaf cohomology was a revolution brought about by the categorical approach: sheaf cohomology is (most generally) defined as a derived functor on the category of sheaves, which is an abelian category but not obviously a category of modules. (The notion of deriving functors in an abelian category was, if I am not mistaken, introduced in Grothendieck's Tohoku paper.)
**One interesting application of this is to the case of singular cohomology itself. The implication is that there is a fixed space $K(G, n)$ (for each abelian group $G$ and each integer $n \geq 0$) such that, for every CW complex $X$, homotopy classes of maps $X \to K(G, n)$ are naturally in bijection with cohomology classes in $H^n(X, G)$. From this it follows that $K(G, n)$ can have only one nonvanishing homotopy group, and one gets, as a consequence of this categorical nonsense, the Eilenberg-MacLane spaces. (In fairness, I should probably point out that, for instance, Hatcher's construction of the Eilenberg-MacLane spaces is basically a toy analog of the proof of Brown representability.)
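The claim about homotopy groups can be checked by plugging in spheres. A sketch, assuming $n \geq 1$ and using the based form of the statement (based homotopy classes $[-,-]_*$ and reduced cohomology $\tilde H$), which is my assumption about how the bijection is normalized:

$$
\pi_k(K(G,n)) = [S^k, K(G,n)]_* \cong \tilde H^n(S^k; G) \cong \begin{cases} G & k = n, \\ 0 & k \neq n, \end{cases} \qquad k \geq 1.
$$

So the only nonvanishing homotopy group of $K(G, n)$ is $\pi_n$, and it is $G$.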
Finally, one major advantage of the categorical philosophy (which I have already hinted at) is that it allows one to reuse ideas. Some ideas, like Yoneda's lemma or the idea of a universal property, take a little while to digest, but they show up so amazingly often, across diverse mathematical disciplines, that it is just more efficient to prove them once in maximal generality than to redo special cases over and over. Perhaps one reason for this is that so many of the constructions one encounters in mathematics (the tangent bundle to a smooth manifold, the singular (co)homology or homotopy groups of a topological space, the tensor product of modules (or rings), the operation of base-change in algebraic geometry) are ultimately functors.
When I was in grad school, I had the idea that category theory was a convenient language in which to state things, but that it didn't have any deep results. However, I now believe nothing could be further from the truth! Category theory has shown itself to be incredibly useful in topology for very concrete problems, such as finding knot invariants and even 3- and 4-manifold invariants. Reshetikhin-Turaev invariants, the Kontsevich integral, and Khovanov homology are powerful link invariants that all arise via a categorical approach.