Tensor product of monoids and arbitrary algebraic structures
Claim. The tensor product of monoids is not associative: We have $\mathbb{N} \otimes (\mathbb{N}^2 \otimes \mathbb{N}^2) \not\cong (\mathbb{N} \otimes \mathbb{N}^2) \otimes \mathbb{N}^2$.
First, let us see what $\mathbb{N} \otimes -$ does in general. This was already mentioned without proof by the OP. If $M$ is a monoid and $\langle u \rangle = \mathbb{N}$ is the free monoid on one generator, then $\mathbb{N} \otimes M$ is generated by $\{u \otimes m : m \in M\}$ subject to the relations saying that $u^p \otimes -$ is a homomorphism for all $p \geq 0$. Since $u^p \otimes m = (u \otimes m)^p = u \otimes m^p$, this means that $u \otimes -$ is a homomorphism and that $u \otimes (m_1 m_2)^p = u \otimes m_1^p m_2^p$ holds for all $m_1,m_2 \in M$ and all $p \geq 0$. Thus, we have an isomorphism $\mathbb{N} \otimes M \cong M / \bigl((m_1 m_2)^p = m_1^p m_2^p\bigr)$.
Definition. Let us call a monoid power-homomorphic if for each $p \geq 0$ the $p$th power map $m \mapsto m^p$ is a homomorphism. Thus, $\mathbb{N} \otimes M$ is the universal power-homomorphic quotient of $M$. In particular, $M$ is power-homomorphic iff $M \cong \mathbb{N} \otimes M$ iff the canonical homomorphism $M \to \mathbb{N} \otimes M$ is an isomorphism.
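To get a feeling for the definition, here is a small sketch that tests the identity $(ab)^p = a^p b^p$ on a finite monoid. The names `power` and `is_power_homomorphic` are made up for this illustration, and the check only runs over $0 \leq p \leq$ `max_p`, so a result of `True` is evidence for the chosen bound and not a proof for all $p$.

```python
from itertools import permutations, product


def power(x, p, mul, one):
    """Return x^p (p >= 0) by repeated multiplication."""
    result = one
    for _ in range(p):
        result = mul(result, x)
    return result


def is_power_homomorphic(elements, mul, one, max_p):
    """Check (a*b)^p == a^p * b^p for all a, b and all 0 <= p <= max_p."""
    for a, b in product(elements, repeat=2):
        for p in range(max_p + 1):
            lhs = power(mul(a, b), p, mul, one)
            rhs = mul(power(a, p, mul, one), power(b, p, mul, one))
            if lhs != rhs:
                return False
    return True


def z6_mul(a, b):
    """Multiplication in Z/6, a commutative monoid with identity 1."""
    return (a * b) % 6


def compose(f, g):
    """Composition of permutations of {0, 1, 2} stored as tuples (f after g)."""
    return tuple(f[g[i]] for i in range(3))


z6 = list(range(6))
s3 = list(permutations(range(3)))

print(is_power_homomorphic(z6, z6_mul, 1, max_p=6))           # True
print(is_power_homomorphic(s3, compose, (0, 1, 2), max_p=6))  # False
```

The symmetric group $S_3$ fails already for $p = 2$: two distinct transpositions $a, b$ satisfy $a^2 b^2 = 1$, but $ab$ is a 3-cycle, so $(ab)^2 \neq 1$.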
Now let us spell out the explicit construction of the tensor product of monoids.
Definition. Let $M,N$ be two monoids. Let $F$ denote the free monoid on the set $|M| \times |N|$. For $w,w' \in F$ we define $w \sim_1 w'$ when we can write $w$ and $w'$ in one of the following forms:
- $w = A (m,nn') B$ and $w' = A (m,n) (m,n') B$
- $w = A (mm',n) B$ and $w' = A (m,n) (m',n) B$
- $w = A (m,1) B$ and $w' = AB$
- $w = A (1,n) B$ and $w' = A B$
Here, $A,B \in F$, $m,m' \in M$ and $n,n' \in N$. Let $\sim_2$ denote the symmetric closure of $\sim_1$, so that $w \sim_2 w'$ holds iff $w \sim_1 w'$ or $w' \sim_1 w$. Finally, let $\sim_3$ denote the transitive closure of $\sim_2$, so that $w \sim_3 w'$ holds iff there is a chain $w = w_0 \sim_2 w_1 \sim_2 \dotsc \sim_2 w_k = w'$. Observe that $\sim_3$ is a congruence relation on $F$, in fact the smallest one which makes the map $(m,n) \mapsto [(m,n)]$ a homomorphism in each variable. Thus, $M \otimes N = F/{\sim_3}$.
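The four rules can be made concrete for $M = N = \mathbb{N}^2$, written additively as pairs of naturals. The sketch below (the function name `shortening_steps` and the tuple encoding of words are my own choices for illustration) lists all words reachable from a given word by a single $\sim_2$-step that decreases the length, i.e. merging two adjacent letters via the first two rules or deleting a letter via the last two. The length-increasing direction is omitted because it produces infinitely many words.

```python
ZERO = (0, 0)  # identity element of N^2, written additively


def add(a, b):
    """Monoid operation of N^2 on pairs of naturals."""
    return (a[0] + b[0], a[1] + b[1])


def shortening_steps(word):
    """Return the set of words obtained from `word` by one length-decreasing step.

    A word is a tuple of letters (m, n) with m, n in N^2.
    """
    results = set()
    for i, (m, n) in enumerate(word):
        # Rules 3 and 4: delete a letter of the form (m, 1) or (1, n).
        if m == ZERO or n == ZERO:
            results.add(word[:i] + word[i + 1:])
        if i + 1 < len(word):
            m2, n2 = word[i + 1]
            # Rule 1 read backwards: (m, n)(m, n') becomes (m, nn').
            if m == m2:
                results.add(word[:i] + ((m, add(n, n2)),) + word[i + 2:])
            # Rule 2 read backwards: (m, n)(m', n) becomes (mm', n).
            if n == n2:
                results.add(word[:i] + ((add(m, m2), n),) + word[i + 2:])
    return results


x1, x2 = (1, 0), (0, 1)  # generators of the first factor
y1, y2 = (1, 0), (0, 1)  # generators of the second factor

w = ((x1, y1), (x2, y2), (x1, y1), (x2, y2))
v = ((x1, y1), (x1, y1), (x2, y2), (x2, y2))

print(shortening_steps(w))       # set()
print(len(shortening_steps(v)))  # 4
```

That a word admits no shortening step does not by itself describe its $\sim_3$-class, since a chain may first lengthen the word and shorten it later.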
Definition. Let $M$ be a monoid. We call $m \in M$ strongly irreducible if $m \neq 1$ and if whenever $m = ab$, then $a=1$ or $b=1$.
Basically, I introduce this strong property because I do not want to bother about units.
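For $\mathbb{N}^2$ (written additively) strong irreducibility can be checked by brute force, because every factorisation $m = a + b$ has $a \leq m$ componentwise. The following sketch, with the hypothetical helper `is_strongly_irreducible`, confirms that among small elements only the two canonical generators qualify.

```python
from itertools import product


def is_strongly_irreducible(m):
    """Decide whether m in N^2 is strongly irreducible (additive notation)."""
    if m == (0, 0):
        return False  # the identity is excluded by definition
    # Enumerate all a with a <= m componentwise; b is then determined by m = a + b.
    for a in product(range(m[0] + 1), range(m[1] + 1)):
        b = (m[0] - a[0], m[1] - a[1])
        if a != (0, 0) and b != (0, 0):
            return False  # found a factorisation with both parts non-trivial
    return True


small = product(range(3), repeat=2)
print([m for m in small if is_strongly_irreducible(m)])  # [(0, 1), (1, 0)]
```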
Lemma. Let $M,N$ be two monoids with the property $xy=1 \implies x=y=1$. Let $x_1,\dotsc,x_r \in M$ (resp. $y_1,\dotsc,y_r \in N$) be a sequence of strongly irreducible elements in $M$ (resp. $N$), such that $x_i \neq x_{i+1}$ (resp. $y_i \neq y_{i+1}$) for $1 \leq i < r$. Consider the element $w := (x_1,y_1) \cdots (x_r,y_r)$ in the free monoid $F$ on the set $|M| \times |N|$. Then the $\sim_3$-equivalence class of $w$ consists precisely of those elements of the form $p_0 (x_1,y_1) p_1 \cdots (x_r,y_r) p_r$, where each $p_i$ is a (possibly empty) product of elements of the form $(1,n)$ or $(m,1)$.
Before we give the proof, let us demonstrate how the claim follows:
Proof of the claim. Since $\mathbb{N}^2$ is power-homomorphic (even commutative), we have $\mathbb{N} \otimes \mathbb{N}^2 \cong \mathbb{N}^2$. Thus, we need to prove that $\mathbb{N} \otimes (\mathbb{N}^2 \otimes \mathbb{N}^2) \not\cong \mathbb{N}^2 \otimes \mathbb{N}^2$, i.e. that $\mathbb{N}^2 \otimes \mathbb{N}^2$ is not power-homomorphic. We denote the canonical generators of the first tensor factor by $\mathbb{N}^2 = \langle x_1,x_2 : x_1 x_2 = x_2 x_1 \rangle$ and of the second tensor factor by $\mathbb{N}^2 = \langle y_1,y_2 : y_1 y_2 = y_2 y_1 \rangle$. Notice that these generators are strongly irreducible and that $xy=1 \implies x=y=1$ holds in $\mathbb{N}^2$. The Lemma, applied to $w = (x_1,y_1) (x_2,y_2) (x_1,y_1) (x_2,y_2)$, implies that in the free monoid on $\mathbb{N}^2 \times \mathbb{N}^2$ the element $w$ is not equivalent to $(x_1,y_1) (x_1,y_1) (x_2,y_2) (x_2,y_2)$: the latter contains no factor of the form $(1,n)$ or $(m,1)$, so it could only have the form described in the Lemma if it were equal to $w$, which it is not. This means that in $\mathbb{N}^2 \otimes \mathbb{N}^2$ we have $((x_1 \otimes y_1) (x_2 \otimes y_2))^2 \neq (x_1 \otimes y_1)^2 (x_2 \otimes y_2)^2$. Thus, $\mathbb{N}^2 \otimes \mathbb{N}^2$ is not power-homomorphic. $\checkmark$
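The comparison made in this proof can be phrased as a simple membership test: by the Lemma, a word lies in the class of $w$ exactly when deleting all letters with an identity component leaves $w$ itself (the letters $(x_i,y_i)$ survive because neither component is $1$). Here is a sketch for $\mathbb{N}^2 \otimes \mathbb{N}^2$ in additive notation. The helper names are invented for this illustration, and the test is only valid for words $w$ satisfying the hypotheses of the Lemma; it relies on the Lemma and does not prove it.

```python
ZERO = (0, 0)  # identity element of N^2, written additively


def strip_trivial(word):
    """Drop every letter (m, n) in which m or n is the identity."""
    return tuple((m, n) for (m, n) in word if m != ZERO and n != ZERO)


def in_class_by_lemma(candidate, w):
    """Membership in the class of w, assuming w satisfies the Lemma's hypotheses."""
    return strip_trivial(candidate) == tuple(w)


x1, x2 = (1, 0), (0, 1)  # generators of the first factor
y1, y2 = (1, 0), (0, 1)  # generators of the second factor

w = ((x1, y1), (x2, y2), (x1, y1), (x2, y2))  # word for ((x1⊗y1)(x2⊗y2))^2
v = ((x1, y1), (x1, y1), (x2, y2), (x2, y2))  # word for (x1⊗y1)^2 (x2⊗y2)^2
padded = ((x1, ZERO),) + w[:2] + ((ZERO, y2),) + w[2:]

print(in_class_by_lemma(padded, w))  # True
print(in_class_by_lemma(v, w))       # False
```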
Now we prove the Lemma.
Proof of the Lemma. It is clear that every element of the form $p_0 (x_1,y_1) p_1 \cdots (x_r,y_r) p_r$ is $\sim_3$-equivalent to $w$. Conversely, let $w''$ be $\sim_3$-equivalent to $w$. There is a chain of $\sim_2$-equivalences from $w$ to $w''$. By induction on its length, we see that there is some $w' = p_0 (x_1,y_1) p_1 \cdots (x_r,y_r) p_r$ such that $w' \sim_2 w''$. We have 2 cases, each one having 4 subcases:
Case 1: $w' \sim_1 w''$.
Case 1a: $w' = A (m,nn') B$, $w'' = A (m,n) (m,n') B$: Then $(m,nn')$ is either some $(x_i,y_i)$ or one of the pairs $(m,1)$ in some $p_i$, or one of the pairs $(1,n'')$ in some $p_i$. In the first case, we have $n=1$ or $n'=1$ since $y_i$ is strongly irreducible, hence $w'' = A (x_i,1) (x_i,y_i) B$ or $w'' = A (x_i,y_i) (x_i,1) B$. We deduce that $w''$ has the desired form, because it results from $w'$ by adding the factor $(x_i,1)$ somewhere. In the second case, we have $nn'=1$, hence $n=n'=1$, hence $w'' = A(m,1)(m,1)B$ has the desired form, because it results from $w'$ by adding the factor $(m,1)$ somewhere. In the third case, we have $m=1$, hence $w'' = A(1,n) (1,n') B$ has the desired form, because it results from $w'$ by expanding $(1,nn')$ in $p_i$ to $(1,n) (1,n')$.
Case 1b: $w' = A (mm',n) B$, $w'' = A (m,n) (m',n) B$: Analogous to 1a.
Case 1c: $w' = A (m,1) B$, $w'' = AB$: Then $(m,1)$ is either some $(x_i,y_i)$, which is not possible since $y_i \neq 1$, or $(m,1)$ is some pair in some $p_i$. Then $w'' = AB$ has the desired form, because it results from $w'$ by removing a factor of $p_i$.
Case 1d: $w' = A (1,n) B$, $w'' = AB$: Analogous to 1c.
Case 2: $w'' \sim_1 w'$.
Case 2a: $w'' = A (m,nn') B$, $w' = A (m,n) (m,n') B$: Then either $(m,n)=(x_i,y_i)$ and hence $(m,n')$ is the first factor of $p_i$ (if $p_i$ were empty, we would have $(m,n') = (x_{i+1},y_{i+1})$, which is impossible because $x_i \neq x_{i+1}$), or $(m,n)$ is some factor of some $p_i$ and hence $(m,n')$ is the next factor of $p_i$, or even $(x_{i+1},y_{i+1})$. If the first is true, then $m=x_i$, $n=y_i$ and $n'=1$ (since $m \neq 1$). Then $w'' = A (x_i,y_i) B$ results from $w'$ by removing the first factor of $p_i$, hence has the desired form. If the second is true, and both $(m,n)$, $(m,n')$ are factors of $p_i$, then $m=1$ and hence $(m,nn') = (1,nn')$ is an allowed factor, or $n=1$ and hence $(m,nn') = (m,n')$ is an allowed factor as well. Now if $(m,n')=(x_{i+1},y_{i+1})$, then $m \neq 1$ and thus $n=1$, so that $w'' = A (x_{i+1},y_{i+1}) B$ results from $w'$ by removing the last factor of $p_i$, and thus has the desired form.
Case 2b: $w'' = A (mm',n) B$, $w' = A (m,n) (m',n) B$: Analogous to 2a.
Case 2c: $w'' = A (m,1) B$, $w' = AB$. This means that we insert some $(m,1)$ somewhere in one of the $p_i$ in $w'$, so that $w''$ has the desired form.
Case 2d: $w'' = A (1,n) B$, $w' = AB$: Analogous to 2c.
This finishes the proof. $\checkmark$
Of course, it would be nice to have a better understanding of $\mathbb{N}^2 \otimes \mathbb{N}^2$ which then directly gives the desired inequality of squares, but I wasn't able to find such a proof. Although you can write down a presentation of $\mathbb{N}^2 \otimes \mathbb{N}^2$ with 4 generators and an infinite set of relations, it is not clear a priori how to exclude unexpected relations which might follow from these relations, i.e. it is not clear a priori how to describe the generated congruence relation. But perhaps someone finds a better way.