I don't get the relationship between differentials, differential forms, and exterior derivatives.
For a beginner just starting to come to grips with these ideas, I think the most useful answer is this:
Except in one special situation (described below), there is essentially no relationship between the exterior derivative of a differential form and the differential (or pushforward) of a smooth map between manifolds, other than the facts that they are both computed locally by taking derivatives and are both commonly denoted by the symbol $d$.
Differential geometry is loaded with notation, and sometimes we just run out of letters, so we have to overload a symbol by interpreting it in different ways in different situations. The fact that two things are represented by the same symbol doesn't always mean that they're "the same" in any deep sense.
The one situation in which the two concepts are directly related is for a smooth map $f\colon M\to\mathbb R$. In this case, we can consider $f$ either as a smooth map between manifolds or as a $0$-form. Considering it as a smooth map, for each $x\in M$, the pushforward is a linear map $df_x\colon T_xM\to T_{f(x)}\mathbb R$. Considering it as a $0$-form, its exterior derivative $df$ is a $1$-form, which means that for each $x\in M$ we have a linear functional $df_x\colon T_xM\to \mathbb R$. The link between the two is the fact that, because $\mathbb R$ is a vector space, there's a canonical identification $T_{f(x)}\mathbb R\cong\mathbb R$, and under that identification these two versions of $df_x$ are exactly the same map.
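As a small illustration of that identification, here is a worked example. The function, the point $p$, and the vector $v$ are chosen only for this sketch, and $t$ denotes the standard coordinate on $\mathbb R$. Take $M=\mathbb R^2$ with coordinates $(x,y)$, let $f(x,y)=x^2+y$, and let $v = a\,\frac{\partial}{\partial x} + b\,\frac{\partial}{\partial y}\in T_pM$ at $p=(x,y)$. Then
$$
\begin{aligned}
\text{pushforward:}\quad & df_p(v) = (2xa+b)\,\left.\frac{d}{dt}\right|_{f(p)} \in T_{f(p)}\mathbb R,\\
\text{exterior derivative:}\quad & df = 2x\,dx + dy, \qquad df_p(v) = 2xa+b \in \mathbb R.
\end{aligned}
$$
The canonical identification $T_{f(p)}\mathbb R\cong\mathbb R$ sends $c\,\frac{d}{dt}\big|_{f(p)}$ to $c$, so the two computations give the same answer.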
The excellent answer by @user86418 explains a sophisticated context in which both pushforwards and exterior derivatives can be viewed as special cases of a more general construction; but that's a context I wouldn't recommend that a beginner spend much time trying to come to terms with.
$\newcommand{\Reals}{\mathbf{R}}\newcommand{\Basis}{\mathbf{e}}\renewcommand{\phi}{\varphi}$As you may know, differential forms on $M$ can be "vector-valued", i.e., they can take values in some vector bundle $E \to M$. In the simplest case, $E$ is the trivial real line bundle, and "$E$-valued differential forms" are just "differential forms". In general, if $(\Basis_{j})_{j=1}^{\ell}$ is a local frame for $E$ in some trivializing coordinate neighborhood $U$ (i.e., both $E$ and $TM$ are trivial over $U$), an $E$-valued $k$-form looks like $$ \sum_{j=1}^{\ell} \omega_{j} \Basis_{j} $$ with the $\omega_{j}$ ordinary $k$-forms in $U$.[1]
If $\phi:M \to N$ is a smooth map, and if $p:E \to N$ is a vector bundle, there is a pullback bundle $\phi^{*}E \to M$ whose total space is $$ \{(x, v) \text{ in } M \times E: \phi(x) = p(v)\} $$ and whose projection map is projection to the first factor. Intuitively, put a copy of the fibre $E_{\phi(x)}$ over $x$ for each $x$ in $M$.
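Two special cases may help make the definition concrete. They are offered only as illustrations, and the names $y_{0}$, $\gamma$, and $V$ are introduced here, not taken from the question. If $\phi$ is the constant map with value $y_{0}$ in $N$, every fibre is a copy of $E_{y_{0}}$, so the pullback is trivial:
$$
(\phi^{*}E)_{x} = \{x\} \times E_{y_{0}} \quad\text{for all $x$ in $M$}, \qquad \phi^{*}E = M \times E_{y_{0}}.
$$
If instead $\gamma:(a, b) \to N$ is a smooth curve and $E = TN$, then
$$
(\gamma^{*}TN)_{t} = \{t\} \times T_{\gamma(t)}N,
$$
and a section $t \mapsto \bigl(t, V(t)\bigr)$ of $\gamma^{*}TN$ is exactly a vector field along $\gamma$. The velocity $\gamma'$ is such a section even when $\gamma$ crosses itself, in which case $\gamma'$ need not come from any vector field on $N$.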
The push-forward is sometimes introduced informally as a mapping $\phi_{*}:TM \to TN$. That map covers $\phi$ rather than the identity, so it is not a morphism of vector bundles over a single base. To get one, regard $\phi_{*}$ as taking values in $\phi^{*}TN$; it is then a mapping between vector bundles over $M$.
The differential $d\phi$ may be viewed as a $1$-form on $M$ taking values in $\phi^{*}TN$, the pullback of the tangent bundle by $\phi$. That's a fancy way of saying that if $v$ is a tangent vector to $M$ at a point $x$, then $d\phi(x)(v)$ is an element of $T_{\phi(x)}N$, and is linear in $v$. (Of course, $d\phi(x)(v)$ actually measures something useful, the "rate of change" of $\phi$ at $x$ in the direction $v$.)
For a real-valued function $f$, the differential $df$ is indeed the exterior derivative, by the usual definition of the exterior derivative. To mesh this with the preceding item, note that $T\Reals$ is a trivialized vector bundle: We know what "$1$" means when we speak of real-valued functions and forms, so $T\Reals = \Reals \times \Reals$ and consequently $f^{*}T\Reals = M \times \Reals$. That means we can view the differential $df$ as a $1$-form with values in $M \times \Reals$, i.e., as a real-valued $1$-form.
Your third bullet point may be answered by the first bullet point above, but it's probably worth mentioning that a smooth map $\phi:M \to N$ could be termed an "$N$-valued $0$-form", whose differential is $d\phi$.
Notationally, I think of $d\phi(x)$ as the linear fibre map sending a tangent vector $v$ in $T_{x}M$ to $d\phi(x)(v)$ in $T_{\phi(x)}N$, while the push-forward $\phi_{*}$ is the bundle map sending $(x, v)$ in $TM$ to $\bigl(\phi(x), d\phi(x)(v)\bigr)$ in $TN$. So, they're not quite identical if you need to split hairs, but in practice you can usually talk only about the push-forward.
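In local coordinates, both descriptions are governed by the Jacobian matrix. The following is a sketch under the assumption that $(x^{1}, \dots, x^{m})$ are coordinates near $x$ on $M$, $(y^{1}, \dots, y^{n})$ are coordinates near $\phi(x)$ on $N$, and $\phi^{i} = y^{i} \circ \phi$; these names are introduced only for this illustration.
$$
d\phi = \sum_{i=1}^{n} \sum_{j=1}^{m} \frac{\partial \phi^{i}}{\partial x^{j}}\, dx^{j} \otimes \frac{\partial}{\partial y^{i}},
\qquad
d\phi(x)\Bigl(\sum_{j=1}^{m} v^{j} \frac{\partial}{\partial x^{j}}\Bigr)
= \sum_{i=1}^{n} \sum_{j=1}^{m} \frac{\partial \phi^{i}}{\partial x^{j}}(x)\, v^{j}\, \left.\frac{\partial}{\partial y^{i}}\right|_{\phi(x)}.
$$
Here each $\frac{\partial}{\partial y^{i}}$ is understood as a section of $\phi^{*}TN$, i.e., evaluated at $\phi(x)$. When $N$ is the real line with coordinate $t$, there is one frame vector $\frac{\partial}{\partial t}$, and dropping it (the trivialization of $T\mathbf{R}$) leaves
$$
df = \sum_{j=1}^{m} \frac{\partial f}{\partial x^{j}}\, dx^{j},
$$
the usual formula for the exterior derivative of a function.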
[1] Technically, an $E$-valued differential $k$-form is a smooth section of the vector bundle $\bigwedge^{k}T^{*}M \otimes E$. If you change trivializations, the form "coefficients" transform like ordinary differential forms, and the "vector parts" transform like sections of $E$. In general, there's no natural exterior derivative on $E$-valued forms; you have to know how to differentiate sections of $E$, as well. If $E$ has locally constant transition functions, however, then there is a natural exterior derivative, given by $$ d\left(\sum \omega_{j} \Basis_{j}\right) = \sum (d\omega_{j}) \Basis_{j}. $$ One can't mention this without making a plug for holomorphic vector bundles: Holomorphic functions act like constants with respect to the Dolbeault operator $\bar{\partial}$ (i.e., $f$ is holomorphic if and only if $\bar{\partial}f = 0$), so there's a well-defined $\bar{\partial}$ operator $$ \bar{\partial}\left(\sum \omega_{j} \Basis_{j}\right) = \sum (\bar{\partial}\omega_{j}) \Basis_{j} $$ for differential forms with values in a holomorphic vector bundle $E$. (!)
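To see why locally constant transition functions are what the componentwise formula needs, here is a sketch of the check. The second local frame $(\Basis'_{i})$ and the transition functions $g_{ij}$ are notation introduced only for this illustration. Suppose that on an overlap $\Basis_{j} = \sum_{i} g_{ij} \Basis'_{i}$. Then
$$
\sum_{j} \omega_{j} \Basis_{j} = \sum_{i} \omega'_{i} \Basis'_{i}
\quad\text{with}\quad
\omega'_{i} = \sum_{j} g_{ij}\, \omega_{j},
\qquad
d\omega'_{i} = \sum_{j} \bigl(dg_{ij} \wedge \omega_{j} + g_{ij}\, d\omega_{j}\bigr),
$$
so that
$$
\sum_{i} (d\omega'_{i}) \Basis'_{i}
= \sum_{j} (d\omega_{j}) \Basis_{j} + \sum_{i, j} (dg_{ij} \wedge \omega_{j}) \Basis'_{i}.
$$
The two frames give the same answer for every form exactly when the last sum vanishes, i.e., when $dg_{ij} = 0$, which says the transition functions are locally constant. The same computation with $\bar{\partial}$ in place of $d$ leaves the extra term $\bar{\partial}g_{ij} \wedge \omega_{j}$, which vanishes when the $g_{ij}$ are holomorphic.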