Just How Strong is Associativity?

Particularly notorious examples of non-associative binary operations are addition and multiplication of numbers, when done on a computer in floating-point arithmetic. That is, the results of x = a+(b+c); and x = (a+b)+c; may differ, sometimes substantially. One needs to be careful about the order of additions in a program to avoid excessive rounding errors in such situations.
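A minimal sketch of the effect in Python, assuming the usual IEEE 754 double-precision floats (the values `a`, `b`, `c` are chosen only for illustration):

```python
import math

a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c    # a + b is exactly 0.0, so c survives
right = a + (b + c)   # b + c rounds back to -1e16, so c is lost

print(left, right)            # 1.0 0.0
print(math.fsum([a, b, c]))   # 1.0, fsum tracks the lost low-order bits
```

When the order of summation cannot be controlled, `math.fsum` is one way to get a correctly rounded sum of a list of floats.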


Associativity of a binary operation allows us to represent it with mappings

Studying the transformations of a set $X$ into itself (we can write $\mathrm{Hom}(X,X)$) is a very natural thing to do: how can $X$ be rearranged? The binary operation on $\mathrm{Hom}(X,X)$ (function composition) is associative by nature.

In group theory, this leads to Cayley's theorem that every group $G$ is isomorphic to a subgroup of $\mathrm{Sym}(G)\subset \mathrm{Hom}(G,G)$.

But after thinking briefly, you can see that pretty much the same line of thought applies to any semigroup $S$ acting on itself via its binary operation. The only caveat is that you need a condition ensuring that $as=bs$ for all $s\in S$ implies $a=b$, so that the map $a\mapsto (s\mapsto as)$ is injective.

If $S$ is a monoid, or if $S$ is cancellative, then we would get this condition for free, and $S$ can be injected into $\mathrm{Hom}(S,S)$ using the same scheme as Cayley's theorem.


(This block added later:) As pointed out in the comments, it now seems painfully obvious that you can adjoin an identity to make any semigroup into a monoid, and then represent the semigroup as a set of self-maps of that monoid, so all semigroups are covered as well. So associativity turns out to characterize which binary operations can be represented this way and which cannot.
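Here is a small sketch of that construction for a finite semigroup. The helper names (`adjoin_identity`, `left_translation`, `is_faithful`) and the choice of the two-element right-zero semigroup ($ab=b$) are illustrative assumptions, not anything from the comments. In that semigroup every element acts as the identity map, so the representation on $S$ itself is not injective; adjoining an identity fixes this.

```python
def adjoin_identity(elements, op, e="e"):
    """Return (elements + [e], op1), where e is a fresh two-sided identity."""
    def op1(a, b):
        if a == e:
            return b
        if b == e:
            return a
        return op(a, b)
    return list(elements) + [e], op1

def left_translation(a, op, domain):
    """Tabulate the map s -> a*s on `domain` as a tuple of outputs."""
    return tuple(op(a, s) for s in domain)

def is_faithful(elements, op, domain):
    """True if distinct elements give distinct left translations on `domain`."""
    tables = [left_translation(a, op, domain) for a in elements]
    return len(set(tables)) == len(elements)

def right_zero(a, b):
    """Right-zero semigroup: a*b = b. Associative, but has no identity."""
    return b

S = [0, 1]
S1, op1 = adjoin_identity(S, right_zero)

faithful_on_S = is_faithful(S, right_zero, S)   # every a acts as the identity map
faithful_on_S1 = is_faithful(S, op1, S1)        # a*e = a now separates the elements

# Homomorphism check: the map for a*b is (map for a) composed with (map for b).
respects_composition = all(
    left_translation(op1(a, b), op1, S1)
    == tuple(op1(a, op1(b, s)) for s in S1)
    for a in S1 for b in S1
)
print(faithful_on_S, faithful_on_S1, respects_composition)  # False True True
```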


The same could not be said for an operation on a set $X$ which isn't associative. There is no way to find an isomorphic copy of it in $\mathrm{Hom}(Y,Y)$ for any set $Y$ at all.
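To spell out why (writing $\star$ for the operation on $X$, which is notation assumed here): suppose $f\colon X\to\mathrm{Hom}(Y,Y)$ is injective and satisfies $f(a\star b)=f(a)\circ f(b)$. Then, since composition is associative,

$$f\big((a\star b)\star c\big)=\big(f(a)\circ f(b)\big)\circ f(c)=f(a)\circ\big(f(b)\circ f(c)\big)=f\big(a\star(b\star c)\big),$$

and injectivity of $f$ forces $(a\star b)\star c=a\star(b\star c)$ for all $a,b,c\in X$. So any operation that embeds in this way must already be associative.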

Nonassociativity means the operation is going to be too pathological for this nice type of representation.

I have the suspicion that one might be able to frame this in terms of representable functors, but after thinking a while (being the armchair category theorist that I am) I couldn't decide if the connection was real.


Every associative binary operation on a set has a faithful representation as a sub-semigroup of the semigroup $(X^X,\circ)$ for some set $X$.

That is the nature of associativity, on some level - it is function composition.

This puts us into the area of category theory, too. It would probably not be very useful to do category theory without associativity of composition.

The most basic place I've seen non-associativity is in $\lambda$-calculus and/or combinatory logic, where $ab$ represents application of $a$ to $b$.

If you look at lambda calculus, you can think of application as $a\star b=\phi_a(b)$. That's actually true for all binary operations, of course: we can always define $\phi_a(b)=a\star b$. For associative operators, however, we have the lovely feature $\phi_a\circ \phi_b = \phi_{a\star b}$.
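A quick sketch of that last identity in Python, assuming we only test it on a small finite sample. The helper names `phi` and `composition_law_holds` are made up for illustration. String concatenation is associative and passes; integer subtraction is not and fails.

```python
from itertools import product

def phi(op, a):
    """Return the unary map phi_a : b -> a * b (curried form of `op`)."""
    return lambda b: op(a, b)

def composition_law_holds(op, sample):
    """Check phi_a(phi_b(c)) == phi_{a*b}(c) for all a, b, c in `sample`."""
    return all(
        phi(op, a)(phi(op, b)(c)) == phi(op, op(a, b))(c)
        for a, b, c in product(sample, repeat=3)
    )

concat_ok = composition_law_holds(lambda a, b: a + b, ["", "x", "yz"])  # concatenation
subtract_ok = composition_law_holds(lambda a, b: a - b, [0, 1, 2])      # subtraction
print(concat_ok, subtract_ok)  # True False
```

Unfolding the definitions, the check `phi_a(phi_b(c)) == phi_{a*b}(c)` is literally $a\star(b\star c)=(a\star b)\star c$, so the identity holds for all elements exactly when $\star$ is associative.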