Motivation for construction of cross-product (Quaternions?)

Before meeting the cross product, students will have already met the dot product. This is a form of multiplication that takes two vectors and gives you a scalar. It's natural to wonder whether there's also some kind of multiplication that takes two vectors and gives you another vector.

Obviously we could just write down any old function and call it "the cross product". But in order for it to actually be a nice form of multiplication there are a few properties that we would expect it to have:

$$\begin{align*} (\lambda\mathbf u) \times \mathbf v=\lambda(\mathbf u \times \mathbf v)=\mathbf u \times (\lambda\mathbf v)\\ \mathbf u \times (\mathbf v+\mathbf w)=\mathbf u \times \mathbf v+\mathbf u \times \mathbf w\\ (\mathbf u + \mathbf v)\times\mathbf w=\mathbf u \times \mathbf w+\mathbf v \times \mathbf w \end{align*}$$

(these properties are called "being bilinear").

There's one more property that we require in order for the cross product to make sense geometrically, which is that it shouldn't depend on which way you are looking at your problem. If we rotate our vectors and then take their cross product we should get the same answer as if we take their cross product and then rotate. Otherwise people looking at the same problem from different angles would get different answers! In symbols we represent the rotation by some orthogonal matrix $M$ with determinant $1$, and say that for each such matrix we want the following property:

$$M(\mathbf u\times\mathbf v)=(M\mathbf u)\times(M\mathbf v)$$

(This is called "invariance" or sometimes "covariance". Notice that the dot product also makes geometric sense in this way. If you rotate two vectors and then take their dot product you get the same answer as you would have gotten before the rotation. Or in other words $(M\mathbf u)\cdot(M\mathbf v)=u\cdot v$.)

Now here's the clever bit: I claim that the cross product is the only function with these properties. This explains why the cross product is interesting: it's the only form of multiplication that makes any sense at all! (Actually there are also the functions like $\lambda\mathbf u\times\mathbf v$ that are just scalings of the cross product by some factor, but each of these functions can be written in terms of any of the others and so we can just pick the usual cross product to be our favourite, and work in terms of that.)


Proof

I'll show that if we have a function, $\times$, that is bilinear and invariant (i.e. it obeys the four equations listed above) then it is in fact the cross product. We'll work in terms of the usual basis vectors $\mathbf i$, $\mathbf j$ and $\mathbf k$.

First we'll try to work out what $\mathbf i \times \mathbf j$ is. Let $M$ be the rotation of $180^\circ$ about the $k$-axis. Then $M\mathbf i=-\mathbf i$ and $M\mathbf j=-\mathbf j$. So we have

$$M(\mathbf i \times \mathbf j)=(M\mathbf i)\times(M\mathbf j)=(-\mathbf i)\times(-\mathbf j)=\mathbf i\times\mathbf j$$

(the last step used bilinearity to move the minus signs out so they could cancel). This means that $\mathbf i \times \mathbf j$ is fixed by the rotation $M$, and so must lie on the $\mathbf k$-axis. As I said before, we're going to allow ourselves to pick our favourite scaling, so since we know $\mathbf i \times \mathbf j$ is on the $\mathbf k$-axis we might as well assume that $\mathbf i \times \mathbf j=\mathbf k$.

There's a rotation (of $120^\circ$ about $\mathbf i+\mathbf j+\mathbf k$) that takes $\mathbf i$ to $\mathbf j$, $\mathbf j$ to $\mathbf k$, and $\mathbf k$ to $\mathbf i$. Applying invariance under this matrix to our equation $\mathbf i \times \mathbf j=\mathbf k$ gives us $\mathbf j \times \mathbf k=\mathbf i$. Applying it again gives $\mathbf k \times \mathbf i=\mathbf j$.

There's also a rotation (of $180^\circ$ about $\mathbf i+\mathbf j$) that takes $\mathbf i$ to $\mathbf j$, $\mathbf j$ to $\mathbf i$, and $\mathbf k$ to $-\mathbf k$. Applying invariance under this matrix to our equation $\mathbf i \times \mathbf j=\mathbf k$ gives us $\mathbf j \times \mathbf i=-\mathbf k$. Similarly we have $\mathbf k \times \mathbf j=-\mathbf i$ and $\mathbf i \times \mathbf k=-\mathbf j$.

Finally we want to know what $\mathbf i \times \mathbf i$ is. Let $M$ be the $180^\circ$ rotation about the $\mathbf k$-axis, as before. Then

$$M(\mathbf i \times \mathbf i)=(M\mathbf i)\times(M\mathbf i)=(-\mathbf i)\times(-\mathbf i)=\mathbf i\times\mathbf i$$

so $\mathbf i \times \mathbf i$ is fixed by $M$ and therefore lies on the $\mathbf k$-axis. But the same argument applied with a rotation about the $\mathbf j$-axis shows that $\mathbf i \times \mathbf i$ lies on the $\mathbf j$-axis too! These two axes only intersect at $\mathbf 0$. So $\mathbf i \times \mathbf i=\mathbf 0$ and by the same argument $\mathbf j \times \mathbf j=\mathbf 0$ and $\mathbf k \times \mathbf k=\mathbf 0$.

Now since we know how to cross product any two basis vectors we can calculate the cross product of any two vectors by multiplying out (using bilinearity):

$$\begin{align*}(u_i\mathbf i+u_j\mathbf j+u_k\mathbf k)\times(v_i\mathbf i+v_j\mathbf j+v_k\mathbf k)= &u_iv_i\mathbf i\times\mathbf i+u_iv_j\mathbf i\times\mathbf j+u_iv_k\mathbf i\times\mathbf k\\ +&u_jv_i\mathbf j\times\mathbf i+u_jv_j\mathbf j\times\mathbf j+u_jv_k\mathbf j\times\mathbf k\\ +&u_kv_i\mathbf k\times\mathbf i+u_kv_j\mathbf k\times\mathbf j+u_kv_k\mathbf k\times\mathbf k\\ =(u_jv_k-u_kv_j)\mathbf i+(u_kv_i-u_iv_k)&\mathbf j+(u_iv_j-u_jv_i)\mathbf k \end{align*}$$

This is the formula for the cross product.


The above was a rewrite of my original answer which said more or less the same thing as above but in more formal terms. I'll put my original answer here because I think some people reading this might like to see the technical details:

Given a $3$-dimensional oriented real inner-product space $V$ the group of symmetries preserving the inner-product and orientation is $\mathrm{SO}(V)$. The invariant tensors under $\mathrm{SO}(V)$ are $\delta_{ij}$, $\delta^{ij}$, and $\varepsilon_{ijk}$, along with the things they generate like $\delta^{ij}\varepsilon_{klm}$ and so on.

The tensors $\delta_{ij}$ and $\delta^{ij}$ are the inner-product and the inner-product induced on the dual space. These aren't very interesting because we defined $\mathrm{SO}(V)$ to preserve these, so we already knew that we were going to get them. But the tensor $\varepsilon_{ijk}$ is in some sense new. Therefore we are motivated to investigate $\varepsilon_{ijk}$ or equivalently the bilinear map $V\times V\rightarrow V$ given by $(v\times w)^i=\delta^{ij}\varepsilon_{jkl}v^kw^l$. This is the cross product.


Here is how I usually introduce the cross product.

Given two vectors $u=(u_1,u_2,u_3), v=(v_1,v_2,v_3)$ we seek another vector $w=(x,y,z)$ which is perpendicular to both. This means

$$u \cdot w= 0 \Rightarrow u_1x+u_2y+u_3z=0 \\ v \cdot w =0 \Rightarrow v_1x+v_2y+v_3z=0 $$

Now, all you have to do is solve this system of equations.Under the extra assumption that $u,v$ are linearly independent, the solution is exactly $t (u \times v)$.

One neat way to see this, is to start assuming that $u_1v_2-u_2v_1 \neq 0$ (or similarly for any pair of indices, which happens at least once for linearly independent vectors), and solve $$u_1x+u_2y=-u_3z \\ v_1x+v_2y=-v_3z $$ using Cramer's rule. The formulas for the cross product pop out immediately.


To me, the wedge product is more intuitive than the cross product. And as it turns out, geometric algebra provides exactly the way to get a vector perpendicular to the bivector $u\wedge v$: taking the algebraic dual. So I define the cross product as $$\bbox[5px,border:2px solid red]{u\times v := (u\wedge v)I^{-1}}$$ where $I$ is the positively oriented unit pseudoscalar of $\Bbb R^3$ (with the usual orientation).

enter image description here

The great thing about this definition is that it immediately tells us that the norm of the cross product is $\|u\|\|v\|\sin(\theta)$ and that it is orthogonal to both $u$ and $v$ (from the properties of the wedge product and the algebraic dual, respectively). So essentially we have to do the derivation you asked for in reverse. We'd like to show that this definition produces the correct components of the cross product. Here's a (pedantically) step-by-step derivation:

$$\begin{align}u\wedge v &= (u_1e_1 + u_2e_2+u_3e_e)\wedge (v_1e_1+v_2e_2+v_3e_3) \\ &= u_1v_1(e_1\wedge e_1) + u_1v_2(e_1\wedge e_2) + u_1v_3(e_1\wedge e_3) + \cdots \\ &= 0 + u_1v_2(e_1\wedge e_2) - u_1v_3(e_3\wedge e_1) + \cdots \\ &= (u_2v_3-u_3v_2)e_2\wedge e_3 + (u_3v_1-u_1v_3)e_3\wedge e_1 + (u_1v_2-u_2v_1)e_1\wedge e_2\end{align}$$

$$\begin{align}(u\wedge v)I^{-1} &= \big[(u_2v_3-u_3v_2)e_2\wedge e_3 + (u_3v_1-u_1v_3)e_3\wedge e_1 + (u_1v_2-u_2v_1)e_1\wedge e_2\big](e_3\wedge e_2\wedge e_1) \\ &= \big[(u_2v_3-u_3v_2)e_2e_3 + (u_3v_1-u_1v_3)e_3e_1 + (u_1v_2-u_2v_1)e_1e_2\big](e_3e_2e_1) \\ &= (u_2v_3-u_3v_2)e_2e_3e_3e_2e_1 + (u_3v_1-u_1v_3)e_3e_1e_3e_2e_1 + (u_1v_2-u_2v_1)e_1e_2e_3e_2e_1 \\ &= (u_2v_3-u_3v_2)e_1 + (-1)^2(u_3v_1-u_1v_3)e_3e_3e_2e_1e_1 + (-1)^4(u_1v_2-u_2v_1)e_1e_1e_2e_2e_3 \\ &= (u_2v_3-u_3v_2)e_1 + (u_3v_1-u_1v_3)e_2 + (u_1v_2-u_2v_1)e_3\end{align}$$

as we expected.


Note: The above seems to skirt the issue by simply replacing the cross product with another: the wedge product. But I'd argue any day that the wedge product is very well motivated, intuitive, and way more useful than the cross product. So while this definition does require a student to spend a little longer learning about the wedge product and dual than if (s)he just learned the cross product, the result should be a much better understanding of the geometric aspects of linear algebra.