Why should a non-commutative operation even be called "multiplication"?

Terms in mathematics do not necessarily have absolute, universal definitions; context matters a great deal. It is common for a term to have similar but not identical meanings in multiple contexts, and it is also common for the meaning to differ considerably. It can even differ from author to author within the same context. To be sure of the meaning, you need to check the author's definition.

It might be tempting to demand that multiplication be commutative and that another term be used when it isn't, but that would break some nice patterns, such as the progression from the real numbers to the complex numbers to the quaternions.
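To see the pattern at stake: the quaternions extend the complex numbers in a natural way, yet their multiplication already fails to commute. Hamilton's defining relations give

$$i^2 = j^2 = k^2 = ijk = -1, \qquad ij = k \quad\text{but}\quad ji = -k.$$

Insisting on a different word here would obscure how directly quaternion multiplication generalizes complex multiplication.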

In day-to-day life, multiplication is commutative, but only because everyday arithmetic deals exclusively with real numbers. As you go deeper into mathematics, you will need to unlearn this assumption: very often, something called multiplication is not commutative.
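Matrix multiplication is the standard first example. Here is a minimal sketch in Python (using NumPy; the particular matrices are arbitrary) of two matrices whose product depends on the order of the factors:

```python
import numpy as np

# Matrix "multiplication" is associative and has a neutral element (the
# identity matrix), but the order of the factors matters.
A = np.array([[1, 1],
              [0, 1]])
B = np.array([[1, 0],
              [1, 1]])

print(A @ B)   # [[2 1]
               #  [1 1]]
print(B @ A)   # [[1 1]
               #  [1 2]]
print(np.array_equal(A @ B, B @ A))  # False: A·B != B·A
```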


In school you might also have learned that multiplication and addition are defined for "numbers". But vectors, matrices, and functions are not numbers; why should they be allowed to be added or multiplied in the first place?

It turns out that sounds and letters form words, which we use in context to convey information, usually in a concise manner.


In school, for example, most of your mathematical education is about the real numbers, maybe with a bit about the complex numbers. Maybe you even learned to differentiate a function.

But did it ever occur to you that there are functions from $\Bbb R$ to itself such that for every $a<b$, the image of the function on $(a,b)$ is the entire set of real numbers? (The Conway base-13 function is one well-known example.)

If all the functions you dealt with in school were differentiable (or at least differentiable almost everywhere), why is such a thing even a function? What does it even mean for something to be a function?

Well, these are questions that mathematicians dealt with a long time ago, and they decided that we should stick to definitions. So in mathematics we have explicit definitions, and we give them names so that we don't have to repeat the definition each time. The phrase "Let $H$ be a Hilbert space over $\Bbb C$", for example, packs into eight words an immense amount of knowledge that usually takes a long time to acquire.

Sometimes, out of convenience and out of sheer love of generalization, mathematicians take a word which has a "common meaning" and decide that it is good enough to be used in a different context and to mean something else. Germ, stalk, filter, sheaf, quiver, and graph are all words that take a cue from the natural sense of the word and are given an explicit definition in context.

(And I haven't even talked about things which have little to no relation to their real world meaning, e.g. a mouse in set theory.)

Multiplication is a word we use in context, and the context is usually an associative binary operation on some set. This set can consist of functions, matrices, sets, etc. If we require the operation to have a neutral element and to admit inverses, we have a group; if we adjoin another operation which is associative, commutative, and admits inverses, and posit some distributivity laws, we get a ring; or a semiring; and so on and so forth.
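As a concrete illustration, here is a minimal sketch in Python (the function names are mine, chosen for the example): composition of functions is exactly such an associative binary operation with a neutral element, and it is not commutative.

```python
# A minimal sketch: function composition as a non-commutative
# "multiplication" with a neutral element.

def compose(f, g):
    """The 'product' f * g, defined by (f * g)(x) = f(g(x))."""
    return lambda x: f(g(x))

identity = lambda x: x        # neutral element: compose(f, identity)(x) == f(x)
double   = lambda x: 2 * x
add_one  = lambda x: x + 1

fg = compose(double, add_one)   # x -> 2 * (x + 1)
gf = compose(add_one, double)   # x -> 2 * x + 1

print(fg(3), gf(3))                  # 8 7 -- the two "products" differ
print(compose(double, identity)(3))  # 6  -- identity acts the way 1 does
```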

But it is convenient to talk about multiplication, because it's a word we already have, and in most cases the pedagogical development helps us grasp why, in generalizations, we might want to drop commutativity from this operation.


Terminology for mathematical structures is often built off of, and analogized to, terminology for "familiar" structures such as the integers, rational numbers, and real numbers. For vectors over the real numbers, we already have addition defined on the coordinates; adding components termwise results in a meaningful operation, and the natural terminology is to refer to that simply as "addition". Multiplying termwise results in an operation that isn't as meaningful (for one thing, this operation, unlike termwise addition, depends on the coordinate system). The cross product, on the other hand, is a meaningful operation, and it interacts with termwise addition much as real multiplication interacts with real addition. For instance, $(a+b)\times c = a \times c + b \times c$ (the distributive property).
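A quick numerical check (a sketch using NumPy; the random vectors are arbitrary) confirms the distributive law, and also shows where the analogy with real multiplication breaks down: the cross product anti-commutes rather than commutes.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, c = rng.random(3), rng.random(3), rng.random(3)

# The cross product distributes over termwise addition ...
print(np.allclose(np.cross(a + b, c), np.cross(a, c) + np.cross(b, c)))  # True

# ... but, unlike real multiplication, it anti-commutes:
print(np.allclose(np.cross(a, b), -np.cross(b, a)))                      # True
```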

For matrices, termwise addition is again a meaningful operation. Matrices represent linear operators, and the definition of linearity includes many of the properties of multiplication, such as distributivity: $A(u+v) = A(u) + A(v)$. Thus, it's natural to treat application of a linear operator as "multiplying" a vector by a matrix, and from there it's natural to define matrix multiplication as composition of the linear operators: $(AB)(v) = A(B(v))$.
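A short sketch (again with NumPy, on arbitrary random data) verifies that the matrix product $AB$ implements exactly the composition "apply $B$, then $A$":

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((2, 2))
B = rng.random((2, 2))
v = rng.random(2)

# The single matrix A @ B acts on v exactly as "apply B, then apply A":
print(np.allclose((A @ B) @ v, A @ (B @ v)))  # True
```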