Is the Cartesian product of sets associative?
Equality in $(A\times B) \times C=A\times (B\times C)$ holds if, and only if, at least one of the sets is empty. As mentioned already, there is a canonical isomorphism $(A\times B) \times C\to A\times (B\times C)$, mapping $((a,b),c)$ to $(a,(b,c))$. Many authors thus treat $\times $ as if it were an associative operation on sets, though it is not. However, the true reason why one can safely pretend $\times $ is associative without running into problems is coherence. As a warm-up, assume you have $127$ sets and you multiply them, using $\times $ in some way. You get an incredibly complicated set whose elements contain lots and lots of parentheses in all sorts of places. You would like to say that this set is essentially just the set of tuples of length $127$ where the $k$-th coordinate is taken from the $k$-th set, but are you really sure that's the case? The expressions can be so complicated that it may no longer be obvious. And besides, is the mere existence of a bijection between two sets really enough to identify them? Of course not. So what matters here is not so much the sets themselves as the functions between them.
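To make the difference between "equal" and "canonically isomorphic" concrete, here is a minimal Python sketch. It assumes that nested Python tuples are an acceptable stand-in for ordered pairs, and the sets `A`, `B`, `C` and the function name `assoc` are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical small sets, chosen only for illustration.
A, B, C = {1, 2}, {"x", "y"}, {"p", "q"}

left = set(product(product(A, B), C))    # elements look like ((a, b), c)
right = set(product(A, product(B, C)))   # elements look like (a, (b, c))

def assoc(t):
    """Canonical map ((a, b), c) -> (a, (b, c))."""
    (a, b), c = t
    return (a, (b, c))

# The two products are different sets ...
sets_are_equal = (left == right)

# ... yet assoc maps one onto the other without collisions.
image = {assoc(t) for t in left}
is_bijection = (image == right) and (len(image) == len(left))

print(sets_are_equal, is_bijection)  # prints: False True
```

The two sets have no element in common, but `assoc` matches them up element by element, which is all that "treating $\times$ as associative" uses for three sets.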
Consider again the canonicity of the functions used for the identification $(A\times B) \times C \cong A\times (B\times C)$, but now take four sets. When you have four sets you can multiply them together (in a given fixed order) in five different ways. It is tempting to say that these five sets are essentially the same but, in light of the above, that is not quite what we wish to say. We don't want to identify them, since they are different. What we want to know is that the canonical functions between them compose coherently. The precise meaning of that is a bit technical, and you may wish to read Mac Lane's "Categories for the Working Mathematician" to properly understand coherence, but basically you will discover the definition yourself if you take four sets, multiply them to get five sets, place them at the vertices of a pentagon, connect the dots using the canonical maps, and claim that the different compositions along the perimeter are the same. The point then is Mac Lane's coherence theorem: starting with any finite number of sets, any two compositions of the canonical functions from one way of multiplying the sets to another are the same.
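The pentagon condition for four sets can be checked mechanically on a small example. The following is only a sketch under the same assumption that nested Python tuples model ordered pairs. The helper names (`assoc`, `on_left`, `on_right`, `short_path`, `long_path`) and the sample sets are hypothetical. A finite check like this illustrates the condition but of course proves nothing in general.

```python
from itertools import product

def assoc(t):
    """Canonical map ((x, y), z) -> (x, (y, z))."""
    (x, y), z = t
    return (x, (y, z))

def on_left(f):
    """Apply f to the first component of a pair, leave the second alone."""
    return lambda t: (f(t[0]), t[1])

def on_right(f):
    """Apply f to the second component of a pair, leave the first alone."""
    return lambda t: (t[0], f(t[1]))

def short_path(t):
    # ((a,b),c),d -> (a,b),(c,d) -> a,(b,(c,d))
    return assoc(assoc(t))

def long_path(t):
    # ((a,b),c),d -> (a,(b,c)),d -> a,((b,c),d) -> a,(b,(c,d))
    return on_right(assoc)(assoc(on_left(assoc)(t)))

# Hypothetical sample sets.
A, B, C, D = {0, 1}, {"x"}, {"p", "q"}, {2.5}

start = set(product(product(product(A, B), C), D))  # ((a, b), c), d
pentagon_commutes = all(short_path(t) == long_path(t) for t in start)
print(pentagon_commutes)  # prints: True
```

Both routes around the pentagon send $(((a,b),c),d)$ to $(a,(b,(c,d)))$, which is the kind of agreement the coherence theorem guarantees for every number of factors.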
So, we can safely pretend $\times $ is associative on sets because there are canonical maps between different choices which (and this is really important) always compose to give the same function when used on any number of sets, multiplied in any way you like.
An illustrative example where coherence fails for the 'wrong' choice of canonical maps is when you consider signed elements. Suppose that all your sets have elements, each of which is designated as either positive or negative, and that each element $x$ can change sign to $-x$. Now, you can map $A\times (B\times C)\to (A\times B)\times C$ by $(a,(b,c))\mapsto -((a,b),c)$. Mathematically, it's just as canonical as the other possibility. However, this one is not coherent (you can check that with the pentagon).
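Here is one way to see the failure concretely, as a sketch under explicit modelling assumptions that are not in the text above: a signed tuple is represented as a pair `(sign, nested_tuple)` with `sign` equal to `+1` or `-1`, and a sign change inside one component is assumed to flip the sign of the whole tuple. All function names are hypothetical.

```python
def unassoc(t):
    """Plain rebracketing (a, (b, c)) -> ((a, b), c)."""
    a, (b, c) = t
    return ((a, b), c)

def twisted(signed):
    """The sign-flipping map (a, (b, c)) -> -((a, b), c)."""
    sign, t = signed
    return (-sign, unassoc(t))

def on_left(f):
    """Apply f inside the first component (assumption: its sign flip
    propagates to the whole tuple)."""
    def g(signed):
        sign, (left, right) = signed
        new_sign, new_left = f((sign, left))
        return (new_sign, (new_left, right))
    return g

def on_right(f):
    """Apply f inside the second component, with the same sign assumption."""
    def g(signed):
        sign, (left, right) = signed
        new_sign, new_right = f((sign, right))
        return (new_sign, (left, new_right))
    return g

def short_path(s):
    # a,(b,(c,d)) -> (a,b),(c,d) -> ((a,b),c),d : two twisted maps
    return twisted(twisted(s))

def long_path(s):
    # a,(b,(c,d)) -> a,((b,c),d) -> (a,(b,c)),d -> ((a,b),c),d : three twisted maps
    return on_left(twisted)(twisted(on_right(twisted)(s)))

start = (1, ("a", ("b", ("c", "d"))))
short_result = short_path(start)
long_result = long_path(start)
print(short_result)
print(long_result)
```

In this model both routes reach the same rebracketed tuple, but one route applies the twisted map twice and the other three times, so the resulting signs are opposite and the pentagon does not commute.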