Pushforward of Lie Bracket
This is much simpler in coordinate-independent form. A diffeomorphism $ f : M \rightarrow N $ induces a map $ f_* : \mathcal{X}(M) \rightarrow \mathcal{X}(N) $, which at each $p\in M $ maps the tangent space $T_p(M)$ to $T_{f(p)}(N) $. It is defined by requiring, for every $ g \in C^\infty(N) $, that $f_* (X)(g)(f(p)) = X(g\circ f)(p)$, that is, $ f_*(X)(g)\circ f = X(g\circ f ) $. Thus for $X,Y \in \mathcal{X}(M) $ and all $ g \in C^\infty(N) $ we have \begin{align*} & (f_*[X,Y])_{f(p)}(g) = [X,Y]_p(g\circ f) \\ & = X_p(Y(g\circ f))-Y_p(X(g\circ f)) \\ & = X_p(f_*(Y)(g)\circ f) - Y_p(f_*(X)(g)\circ f) \\ & = f_*(X)_{f(p)}(f_*(Y)(g))-f_*(Y)_{f(p)}(f_*(X)(g)) \\ & = [f_*(X),f_*(Y)]_{f(p)} (g) \end{align*} Since $f$ is surjective, every point of $N$ has the form $f(p)$, hence $ f_*[X,Y] = [f_*(X),f_*(Y)] $.
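The defining relation $f_*(X)(g)\circ f = X(g\circ f)$ can be sanity-checked numerically. The following is only a minimal sketch under assumptions that are not in the answer above: $M = N = \mathbb{R}^2$, the shear map `f`, the vector field `X`, the test function `g`, and the helper names `pushforward` and `apply_field` are all chosen purely for illustration.

```python
import math

def f(x):
    # Assumed diffeomorphism of R^2 (a shear); its inverse is f_inv below
    return (x[0], x[1] + x[0] ** 2)

def f_inv(y):
    return (y[0], y[1] - y[0] ** 2)

def jac_f(x):
    # Analytic Jacobian dy^j/dx^i of the shear, indexed as J[j][i]
    return ((1.0, 0.0), (2.0 * x[0], 1.0))

def X(x):
    # Assumed vector field on M, given by its components in x-coordinates
    return (x[1], x[0] * x[1])

def pushforward(V):
    # (f_* V)(y) = Df(p) V(p), where p = f^{-1}(y)
    def fV(y):
        p = f_inv(y)
        J, v = jac_f(p), V(p)
        return tuple(sum(J[j][i] * v[i] for i in range(2)) for j in range(2))
    return fV

def apply_field(V, g, p, h=1e-5):
    # V(g)(p): directional derivative of g along V(p), by central differences
    v = V(p)
    plus = tuple(p[i] + h * v[i] for i in range(2))
    minus = tuple(p[i] - h * v[i] for i in range(2))
    return (g(plus) - g(minus)) / (2 * h)

def g(y):
    # Assumed smooth test function on N
    return math.sin(y[0]) * y[1] + y[0] ** 2

p = (0.7, -0.3)
lhs = apply_field(pushforward(X), g, f(p))    # f_*(X)(g) evaluated at f(p)
rhs = apply_field(X, lambda x: g(f(x)), p)    # X(g o f) evaluated at p
print(abs(lhs - rhs))
```

The printed difference should be close to zero, limited only by the finite-difference error. A check at one point is of course not a proof, just a way to see the definition at work.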
To do this computation in coordinates without using functions and points, you have to adopt the physicist's way of writing things, which is messy and unpleasant :-) However, the chain rule takes care of all the evaluation matters.
Let $f$ map $x$ to $y$. We will denote the Jacobian and inverse Jacobian by $\frac{\partial y^j}{\partial x^i}$ and $\frac{\partial x^j}{\partial y^i}$.
We will write $\tilde{Z}=f^*Z$ and $\tilde{W}=f^*W$ (here $f^*$ denotes the pushforward, more commonly written $f_*$), so the "components change as"
$Z^j = \tilde{Z}^i\frac{\partial x^j}{\partial y^i}$
$W^j = \tilde{W}^i\frac{\partial x^j}{\partial y^i}$
(here we really see $Z$ as the pushforward of $\tilde{Z}$ by the inverse map, and similarly for $W$)
Consider $f^*[Z,W]$
$=\left(Z^i\frac{\partial W^j}{\partial x^i} - W^i\frac{\partial Z^j}{\partial x^i}\right)\frac{\partial y^l}{\partial x^j}\frac{\partial}{\partial y^l}$
$=\left(\tilde{Z}^k\frac{\partial x^i}{\partial y^k}\frac{\partial}{\partial x^i}\left(\tilde{W}^m\frac{\partial x^j}{\partial y^m}\right) - \tilde{W}^k\frac{\partial x^i}{\partial y^k}\frac{\partial}{\partial x^i}\left(\tilde{Z}^m\frac{\partial x^j}{\partial y^m}\right)\right)\frac{\partial y^l}{\partial x^j}\frac{\partial}{\partial y^l}$
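Using the chain rule $\frac{\partial x^i}{\partial y^k}\frac{\partial}{\partial x^i} = \frac{\partial}{\partial y^k}$ and then the product rule, the mixed derivative term is of the form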
$\tilde{Z}^k\tilde{W}^m\frac{\partial^2 x^j}{\partial y^k\,\partial y^m}-\tilde{W}^k\tilde{Z}^m\frac{\partial^2 x^j}{\partial y^k\,\partial y^m} = 0$ (relabel $k \leftrightarrow m$ in the second term and use the symmetry of second partial derivatives)
and the remaining terms give
$\left(\tilde{Z}^k\frac{\partial \tilde{W}^m}{\partial y^k} - \tilde{W}^k\frac{\partial \tilde{Z}^m}{\partial y^k}\right)\delta^l_m\frac{\partial}{\partial y^l}$
which is $[f^*Z,f^*W]$. The trick is to use the chain rule whenever you want the derivation to be compatible with the function you are applying it to. So the middle expressions might not be completely sensible (they are just formal expressions to show the steps), but the derivations are. This only works for diffeomorphisms because the component change rules given above are basically the pushforward expression in coordinates, and that expression is what allows the chain rule to be used at certain steps.
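The component computation can also be checked numerically on a concrete example: push the bracket forward, and separately take the bracket of the pushed-forward fields. This is a rough sketch rather than anything from the answer itself. The shear map `f`, the polynomial fields `Z` and `W`, and the helpers `pushforward`, `partial`, and `bracket` are all assumptions made for illustration, and derivatives are approximated by central differences.

```python
def f(x):
    # Assumed diffeomorphism x -> y of R^2 (a shear)
    return (x[0], x[1] + x[0] ** 2)

def f_inv(y):
    return (y[0], y[1] - y[0] ** 2)

def jac_f(x):
    # Analytic Jacobian dy^j/dx^i, indexed as J[j][i]
    return ((1.0, 0.0), (2.0 * x[0], 1.0))

def Z(x):
    # Assumed vector field, components in x-coordinates
    return (x[1], x[0] * x[1])

def W(x):
    # Second assumed vector field, components in x-coordinates
    return (x[0] ** 2, x[1])

def pushforward(V):
    # Components in y-coordinates: V~^j(y) = (dy^j/dx^i) V^i at x = f^{-1}(y)
    def fV(y):
        p = f_inv(y)
        J, v = jac_f(p), V(p)
        return tuple(sum(J[j][i] * v[i] for i in range(2)) for j in range(2))
    return fV

def partial(V, x, i, h=1e-4):
    # Central-difference derivative of every component of V along coordinate i
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    vp, vm = V(xp), V(xm)
    return tuple((vp[j] - vm[j]) / (2 * h) for j in range(2))

def bracket(A, B):
    # [A, B]^j = A^i d_i B^j - B^i d_i A^j
    def AB(x):
        a, b = A(x), B(x)
        dA = [partial(A, x, i) for i in range(2)]
        dB = [partial(B, x, i) for i in range(2)]
        return tuple(
            sum(a[i] * dB[i][j] - b[i] * dA[i][j] for i in range(2))
            for j in range(2)
        )
    return AB

y = f((0.7, -0.3))
lhs = pushforward(bracket(Z, W))(y)               # pushforward of [Z, W]
rhs = bracket(pushforward(Z), pushforward(W))(y)  # bracket of the pushforwards
err = max(abs(l - r) for l, r in zip(lhs, rhs))
print(err)
```

The two sides should agree up to finite-difference error. Note that `pushforward` needs `f_inv`, which is one concrete way to see where the diffeomorphism assumption enters.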