A result of Schützenberger on commutators and powers in free groups
The case $n=2$ (originally due to Lyndon) admits a very nice geometric argument: one notes that elements $a,b,c$ with $[a,b]=c^2$ lead to a map from the surface $\Sigma_{-1}$ of Euler characteristic $-1$ to a graph. Pulling back midpoints of edges, one obtains essential, two-sided, simple closed curves on $\Sigma_{-1}$. After pinching these, an Euler characteristic count shows that one of the resulting components must be a projective plane, and the result follows.
For $n> 2$, Duncan--Howie proved the stronger theorem that, for any non-trivial element $c$ of $F_2$, the stable commutator length of $c$ is at least $1/2$. (In fact, by passing to covers, one can also deduce the $n=2$ case from the Duncan--Howie theorem.) When you decode it, their proof is quite geometric, relying on defining a sort of combinatorial vector field on surfaces.
Louder and I gave a sort of generalization of Duncan--Howie's proof in our work on one-relator groups. Again, the idea is to count Euler characteristic using some sort of geometric data (we call it a stacking). We don't spell it out explicitly, but you can use our ideas to give a geometric proof of the Duncan--Howie theorem, and in particular of the result that you want.
(a) On the free 2-step nilpotent group $L$ on $(u,v)$, the commutator $z=[u,v]$ is not a proper power.
Indeed, since $L/[L,L]$ is torsion-free, any root of $z$ should lie in $[L,L]$ and the latter is readily seen to be the cyclic group generated by $z$. (Remark: $L$ is isomorphic to the integral 3-dimensional Heisenberg group.)$\qquad\Box$
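Fact (a) can be checked by hand in the Heisenberg model mentioned in the remark. The following is a minimal sketch, assuming we encode the upper unitriangular integer matrix with entries $x,y,z$ above the diagonal as the triple `(x, y, z)`; the function names and the commutator convention $[g,h]=ghg^{-1}h^{-1}$ are choices made here for illustration.

```python
# Integral Heisenberg group: the triple (x, y, z) stands for the matrix
# [[1, x, z], [0, 1, y], [0, 0, 1]].  This encoding is an assumption of
# the sketch, not notation taken from the answer.

def mul(g, h):
    """Matrix product, written on triples."""
    x, y, z = g
    x2, y2, z2 = h
    return (x + x2, y + y2, z + z2 + x * y2)

def inv(g):
    """Inverse of (x, y, z)."""
    x, y, z = g
    return (-x, -y, -z + x * y)

def commutator(g, h):
    """[g, h] = g h g^-1 h^-1 (one of the two usual conventions)."""
    return mul(mul(g, h), mul(inv(g), inv(h)))

def power(g, n):
    """g^n for an integer n >= 0, by repeated multiplication."""
    result = (0, 0, 0)
    for _ in range(n):
        result = mul(result, g)
    return result

u, v = (1, 0, 0), (0, 1, 0)
z = commutator(u, v)  # the central generator (0, 0, 1)
```

In this model one finds $(x,y,z)^n=(nx,\,ny,\,nz+xy\,n(n-1)/2)$, so an $n$-th root of $z=(0,0,1)$ would need $x=y=0$ and $nz=1$, which has no integer solution for $n\ge 2$. This matches the argument above: the root is forced into the centre, which is generated by $z$.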
I'll use three other very classical facts, without proof: the hardest (b) has both geometric and combinatorial proofs (the simplest using that $\pi_1$ of graphs are free), while (c),(d) are elementary exercises ((c) follows from Malcev's more general results but is very easy by hand here).
(b) Subgroups of free groups are free.
(c) $L$ is Hopfian (all its surjective group endomorphisms are automorphisms).
(d) If $N$ is a nilpotent group and $M$ a subgroup generating $N$ modulo $[N,N]$ (that is, $M[N,N]=N$), then $M=N$.
This allows us to prove the result.
In a free group $F$, if $a,b,c$ are elements and $[a,b]\neq 1$, then $[a,b]$ is not a proper power $c^n$ of $c$.
By (b), we can suppose that $F$ is generated by $a,b,c$. If $d$ is the rank of $F$, it is also the rank of its abelianization, and the relation $[a,b]=c^n$ then forces $d\le 2$. Also $d\ge 2$ since $a,b$ don't commute. So $F$ is free on 2 generators (up to now this is exactly Baumslag's argument).
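To spell out the rank count in this step (a sketch; the bars, denoting images in the abelianization $F/[F,F]\cong\mathbb Z^d$ written additively, are notation introduced here), take $n\ge 2$:

$$
\begin{aligned}
\overline{[a,b]} = 0 \quad&\Longrightarrow\quad n\,\bar c = \overline{c^n} = 0 \quad\Longrightarrow\quad \bar c = 0 \quad (\mathbb Z^d \text{ is torsion-free}),\\
\mathbb Z^d = \langle \bar a,\bar b,\bar c\rangle = \langle \bar a,\bar b\rangle \quad&\Longrightarrow\quad d\le 2.
\end{aligned}
$$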
Hence $L=F/[F,[F,F]]$ is free 2-step nilpotent on 2 generators. Denote by $A,B,C$ the images of $a,b,c$ in $L$. Since $L/[L,L]$ is torsion-free and $C^n\in [L,L]$, we have $C\in [L,L]$. By (d), we deduce that $L$ is generated by $A,B$. So the endomorphism of $L$ mapping the two free generators to $A$ and $B$ is surjective. By (c), we deduce that $L$ is free 2-step nilpotent on $(A,B)$. Since $[A,B]=C^n$, we can use (a) to get a contradiction. (Baumslag's conclusion of the argument also uses a 2-step nilpotent group, but I haven't followed it, as it's more technical.)
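The statement can also be sanity-checked by brute force on short words. Below is a minimal sketch, assuming words in $F_2$ are encoded as strings in which `a`, `b` are the generators and `A`, `B` their inverses; it relies on the standard fact that an element of a free group is a proper power exactly when its cyclically reduced form is a proper power as a string. All function names are hypothetical, and a finite search is of course only a check, not a proof.

```python
from itertools import product

def reduce_word(w):
    """Freely reduce a word: cancel adjacent pairs such as 'aA' or 'Bb'."""
    out = []
    for x in w:
        if out and out[-1] == x.swapcase():
            out.pop()
        else:
            out.append(x)
    return "".join(out)

def inverse(w):
    """Inverse word: reverse the letters and invert each of them."""
    return w[::-1].swapcase()

def commutator(a, b):
    """[a, b] = a b a^-1 b^-1, freely reduced."""
    return reduce_word(a + b + inverse(a) + inverse(b))

def cyclically_reduce(w):
    """Strip matching first/last letters until the word is cyclically reduced."""
    w = reduce_word(w)
    while len(w) >= 2 and w[0] == w[-1].swapcase():
        w = w[1:-1]
    return w

def is_proper_power(w):
    """True if w represents g^k for some k >= 2 in the free group."""
    w = cyclically_reduce(w)
    n = len(w)
    return any(n % k == 0 and w == w[: n // k] * k for k in range(2, n + 1))

def reduced_words(max_len, letters="aAbB"):
    """All non-empty freely reduced words of length at most max_len."""
    for n in range(1, max_len + 1):
        for t in product(letters, repeat=n):
            w = "".join(t)
            if reduce_word(w) == w:
                yield w

def find_counterexample(max_len):
    """Search for a, b with [a, b] non-trivial and a proper power."""
    words = list(reduced_words(max_len))
    for a in words:
        for b in words:
            c = commutator(a, b)
            if c and is_proper_power(c):
                return (a, b)
    return None
```

Calling `find_counterexample(3)` runs over all pairs of reduced words of length at most 3 and, consistently with the theorem, finds nothing.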
There is a visualisation of Schützenberger's observation based on a variant of the car-crash lemma that says that
for any multiple motion on a map on a closed oriented surface $S$, the number of points of complete collision is at least $\chi(S)+\sum\limits_D(d_D-1)$, where the summation runs over all faces $D$ of the map and $d_D$ is the number of cars moving around a face $D$.
See [this self-advertisement] for exact definitions.
So, if a non-identity commutator in a free group $F(x,y,\dots)$ is, e.g., a cube of a word $c$, we obtain a one-face map on a torus, where the label of the face is $c^3$. Let three cars move around this face counter-clockwise at a constant speed of one edge per minute; at a moment $t$ each car moves along an edge labelled by the $[t]$th letter of the word $c$ (where the integer part $[t]$ of $t\in\mathbb R$ is counted modulo the length of $c$).
The car-crash lemma asserts that $$ \small \begin{pmatrix} \text{the number of points} \\ \text{of complete collision} \end{pmatrix} \geqslant \begin{pmatrix} \text{the Euler characteristic} \\ \text{of the torus,} \\ \text{i.e. } 0 \end{pmatrix} + \begin{pmatrix} d_D\text{, i.e. the number of cars} \\ \text{moving around the unique face } D, \\ \text{i.e. } 3 \end{pmatrix} - 1 = 2. $$ On the other hand, a collision inside an edge is impossible:
such a collision would imply that some letter of the word $c$ is $x$ and $x^{-1}$ simultaneously. A similar argument shows that a collision at a vertex cannot occur either. This contradiction proves that a commutator cannot be a cube.
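The same count appears to go through for any exponent, not only for a cube. As a sketch, assuming a one-face map on the torus $T^2$ with face label $c^n$ and $n$ cars moving as above (the general $n$ is an extrapolation of the example, not a statement quoted from the cited papers), the lemma would give

$$
\#\{\text{points of complete collision}\} \;\geqslant\; \chi(T^2) + (d_D-1) \;=\; 0 + (n-1) \;=\; n-1 \;\geqslant\; 1 \qquad (n\geqslant 2),
$$

while the argument with the labels excludes every collision, so a non-identity commutator could not be an $n$-th power for any $n\geqslant 2$.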
This approach was used in [this self-advertisement], [this self-advertisement], and [this self-advertisement] to obtain generalisations of Schützenberger's observation and Duncan--Howie's theorem (mentioned by Henry Wilton).