What's the intuition with partitions of unity?
In a few words, the point of partitions of unity is to take functions (or differential forms or vector fields or tensor fields, in general) that are locally defined, bump them off so they're smoothly $0$ outside their domain of definition, and then add them all up to get something globally defined.
For example, suppose you have a surface $S$ in $\mathbb R^3$ that you can locally write as $f=0$, but you don't know how to do so globally. You can cover $S$ with open sets $U_i\subset\mathbb R^3$ on which you have smooth functions $f_i\colon U_i\to\mathbb R$ with $S\cap U_i = \{x\in U_i: f_i(x)=0\}$. Consider a partition of unity $\Phi = \{\phi_i\}$ subordinate to this cover, so that each $\phi_i$ is supported in $U_i$. Then $f=\sum \phi_if_i$ (with each term extended by $0$ outside $U_i$) will define a smooth function with $f=0$ on $S$. If you want $f$ to be zero only on $S$, you can take an additional open set $U_0 = \mathbb R^3 - S$, set $f_0 = 1$, and throw $\phi_0f_0$ into your sum; for this you need the terms not to cancel away from $S$, which you can arrange by making every $f_i \ge 0$ (replace $f_i$ by $f_i^2$ if necessary).
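To spell out why $f$ vanishes on $S$, here is the pointwise check, assuming the $\phi_i$ form a partition of unity with each $\phi_i$ supported in $U_i$ and each product $\phi_i f_i$ extended by $0$ outside $U_i$. For $x \in S$,

$$f(x) \;=\; \sum_{i \,:\, x \in U_i} \phi_i(x)\, f_i(x) \;=\; \sum_{i \,:\, x \in U_i} \phi_i(x)\cdot 0 \;=\; 0,$$

because $x \in S \cap U_i$ forces $f_i(x) = 0$, while the indices with $x \notin U_i$ contribute nothing since $\phi_i(x) = 0$ there.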
Here's what I say when I'm teaching this. These comments are usually spread over several lectures, but I'll say them all at once here.
The first use of partitions of unity is usually to construct integrals over manifolds. For example, let $S$ be a surface in $\mathbb{R}^3$, and say we want to integrate a $2$-form $\omega$ over it (or alternatively, integrate the flux of a vector field across it).
What we would probably do in practice is break $S$ up into patches $S = \bigcup U_i$ with parametrizations $f_i : P_i \to U_i$ by various open sets $P_i \subset \mathbb{R}^2$, pull the differential form back across the parametrization and integrate on each $P_i$. We then need the patches $U_i$ to cover $S$ up to measure $0$. For example, if $S$ is the unit sphere, we might use a single patch in spherical coordinates with $P = (-\pi, \pi) \times (-\pi/2, \pi/2)$ and $f(\theta, \phi) = (\cos \theta \cos \phi, \sin \theta \cos \phi, \sin \phi)$. Alternatively we might parameterize the northern and southern hemispheres separately, taking $P_1 = P_2 = \{ (u,v) : u^2+v^2<1 \}$, with $f_1(u,v) = (u,v,\sqrt{1-u^2-v^2})$ and $f_2(u,v) = (u,v,-\sqrt{1-u^2-v^2})$. Or we might use a stereographic projection: $P = \mathbb{R}^2$, $f(u,v) = \left( \tfrac{2u}{1+u^2+v^2}, \tfrac{2v}{1+u^2+v^2}, \tfrac{1-u^2-v^2}{1+u^2+v^2} \right)$.
This is exactly how to compute integrals in practice. But if we use it as our definition in theory, it becomes messy -- we have to talk about the combinatorics of how the patches fit together, and our integrands will have discontinuities at the boundaries of the patches. We also might convert a compactly supported integrand to a noncompactly supported one -- look at the example of a stereographic projection above -- or convert a bounded integrand to an unbounded one -- look at the example with the square root.
A partition of unity allows us to blend from one patch to another more smoothly. For example, instead of saying that every point is either exactly in the northern hemisphere, or exactly in the southern hemisphere, we have two functions $\phi_1$ and $\phi_2$ with $\phi_1+\phi_2=1$, where $\phi_1$ measures how much we will count the point toward the northern integral, and $\phi_2$ measures how much we will count the point toward the southern integral. This gives integrals that are much worse for hand computation, but have cleaner theoretical properties.
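As a concrete illustration, here is a minimal Python sketch of one possible pair $\phi_1, \phi_2$ on the unit sphere, written as functions of the height $z$. The names `smooth_step`, `phi_north`, `phi_south` and the blending half-width `eps` are my own choices for this sketch, not anything canonical; the construction uses the standard $e^{-1/t}$ bump trick.

```python
import math

def _h(t):
    """exp(-1/t) for t > 0, extended by 0 for t <= 0 (smooth on the real line)."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_step(t):
    """Smooth function that is 0 for t <= 0 and 1 for t >= 1."""
    # The denominator is never zero: at least one of t, 1 - t is positive.
    return _h(t) / (_h(t) + _h(1.0 - t))

def phi_north(z, eps=0.1):
    """Weight of a point at height z toward the northern integral.

    Vanishes for z <= -eps, equals 1 for z >= eps, blends smoothly in between.
    """
    return smooth_step((z + eps) / (2.0 * eps))

def phi_south(z, eps=0.1):
    """Weight toward the southern integral, chosen so the two weights sum to 1."""
    return 1.0 - phi_north(z, eps)
```

Away from the band $|z| < \texttt{eps}$ each point is counted entirely toward one hemisphere; inside the band the two weights trade off smoothly.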
Incidentally, I believe this should be better for machine Monte Carlo integration. That is to say, suppose I want to integrate a $2$-form $\omega$ over the sphere $S^2 \subset \mathbb{R}^3$. One approach would be to parametrize the northern hemisphere and southern hemisphere separately, pulling $\omega$ back to forms supported on discs in two copies of $\mathbb{R}^2$, with discontinuities at the boundary of the disc, and compute these integrals by Monte Carlo. Alternatively, I could use a continuous partition of unity supported on the open sets $z<0.1$ and $z>-0.1$, and pull back by stereographic coordinates to slightly larger discs; my integrands would then be continuous. I believe Monte Carlo integration usually prefers continuous integrands, as that way it is not important to determine exactly which side of the discontinuity a random sample point lies on.
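Here is a rough Python sketch of that second approach, under some simplifying assumptions of mine: it integrates a scalar function $g$ against the area element rather than a general $2$-form (which avoids orientation bookkeeping between the two charts), it uses a smooth height-based partition of unity with blending band $|z| < 0.1$, and it samples uniformly from a square containing the slightly larger disc. All function names are hypothetical, and this is an illustration rather than a tuned integrator.

```python
import math
import random

def _h(t):
    """exp(-1/t) for t > 0, extended by 0 for t <= 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def phi_north(z, eps=0.1):
    """Smooth weight: 0 for z <= -eps, 1 for z >= eps."""
    t = (z + eps) / (2.0 * eps)
    return _h(t) / (_h(t) + _h(1.0 - t))

def chart(u, v, sign):
    """Inverse stereographic projection onto the unit sphere.

    sign=+1 sends (0, 0) to the north pole, sign=-1 to the south pole.
    """
    r2 = u * u + v * v
    return (2 * u / (1 + r2), 2 * v / (1 + r2), sign * (1 - r2) / (1 + r2))

def mc_sphere_integral(g, n=100_000, eps=0.1, seed=0):
    """Monte Carlo estimate of the integral of g(x, y, z) over the unit sphere."""
    rng = random.Random(seed)
    # In either chart the weight vanishes once r^2 >= (1 + eps) / (1 - eps),
    # so a square of half-width R contains the whole support.
    R = math.sqrt((1 + eps) / (1 - eps))
    box_area = (2 * R) ** 2
    total = 0.0
    for sign in (+1, -1):
        acc = 0.0
        for _ in range(n):
            u = rng.uniform(-R, R)
            v = rng.uniform(-R, R)
            x, y, z = chart(u, v, sign)
            w = phi_north(z, eps) if sign > 0 else 1.0 - phi_north(z, eps)
            # Area element of the sphere in stereographic coordinates.
            jac = 4.0 / (1 + u * u + v * v) ** 2
            acc += w * g(x, y, z) * jac
        total += box_area * acc / n
    return total
```

Calling `mc_sphere_integral(lambda x, y, z: 1.0)` should give an estimate near the sphere's area $4\pi$, up to Monte Carlo noise. The integrand in each chart goes continuously to zero at the edge of its support, so no sample needs to be classified as inside or outside a hemisphere.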
Later uses of partitions of unity are also often of the form "I would like to chop my manifold into pieces, but that is too discontinuous an operation." For example, let's show that every short exact sequence of vector bundles splits. Let $0 \to A \to B \to C \to 0$ be the short exact sequence and let $X$ be the manifold. One would like to cut $X$ into pieces $U_i$ where the bundles are trivial and write down a section $\sigma_i : C \to B$ on each $U_i$. But the map obtained by using $\sigma_i$ on each piece is generally not continuous where the pieces meet. If instead we take a partition of unity $\phi_i$ and write $\sigma = \sum \phi_i \sigma_i$, then $\sigma$ is a smoother version of gluing the $\sigma_i$, and it is continuous.
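It is worth checking that $\sigma$ really is a splitting and not just some continuous map. Write $\pi : B \to C$ for the surjection in the sequence (the name $\pi$ is my notation), so that each local section satisfies $\pi \circ \sigma_i = \mathrm{id}_C$ over $U_i$. Since $\pi$ is linear on each fiber and $\sum_i \phi_i = 1$,

$$\pi \circ \sigma \;=\; \pi \circ \Big( \sum_i \phi_i\, \sigma_i \Big) \;=\; \sum_i \phi_i \,(\pi \circ \sigma_i) \;=\; \sum_i \phi_i \,\mathrm{id}_C \;=\; \mathrm{id}_C .$$

The key point is that the condition "is a section" is preserved by convex combinations, which is exactly what a partition of unity produces at each point.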
My mental metaphor for a partition of unity is feathering out paint. If your paint stops abruptly at the edge of the brush stroke, it will leave a visible line even once you paint the wall next to it. Instead, you need to smear out the edge of your stroke so it thins out gradually. I haven't tried bringing a paint can into class yet, though!
The idea behind a number of proofs is as follows.
We want to prove theorem "A" for certain functions $f$.
If the theorem is true for two functions $f_1$ and $f_2$, then it is true for $f_1 + f_2$.
The theorem is true for the same class of functions locally. For example, the space might be a manifold and the theorem is true for compactly supported functions.
So the theorem holds on each set of a cover of the space, and you then construct a partition of unity $\varphi_i$ subordinate to this cover. Each $\varphi_i f$ is compactly supported, so the theorem holds for it, and hence it holds for their sum $f = \sum_i \varphi_i f$. This sum is locally finite, so near each point it is a finite sum.
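As one illustration of this pattern (my choice of example, not the only one), take the statement $\int_M d\omega = 0$ for a compactly supported $(n-1)$-form $\omega$ on an oriented $n$-manifold $M$ without boundary, and suppose it is already known for forms supported in a single coordinate chart. Since $\sum_i \varphi_i = 1$ and only finitely many $\varphi_i$ are nonzero on the compact support of $\omega$,

$$\int_M d\omega \;=\; \int_M d\Big( \sum_i \varphi_i\, \omega \Big) \;=\; \sum_i \int_M d(\varphi_i\, \omega) \;=\; \sum_i 0 \;=\; 0 .$$

Each $\varphi_i \omega$ is supported in one chart, so the local case applies to it, and additivity carries the result to the sum.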