Truly intuitive geometric interpretation for the transpose of a square matrix

One geometric description of $A^T$ can be obtained from the singular value decomposition (SVD); this will be similar to your third point. Any square matrix $A \in M_n(\mathbb{R})$ can be written as a product $A = S \Lambda R^T$ where $\Lambda$ is diagonal with non-negative entries and both $S,R$ are orthogonal matrices. The diagonal entries of $\Lambda$ are called the singular values of $A$, while the columns of $S$ and $R$ are called the left and right singular vectors of $A$ respectively. They can be computed explicitly (or at least as explicitly as one can compute eigenvalues and eigenvectors). Using this decomposition, we can describe $A^T$ as

$$ A^T = (S\Lambda R^T)^T = R \Lambda S^T. $$

What does this mean geometrically? Assume for simplicity that $n = 2$ (or $n = 3$) and that $\det S = \det R = 1$, so that $R,S$ are rotations. Consider first the symmetric case: if $A$ is symmetric, we can write $A = R \Lambda R^T$ where $R$ is a rotation and $\Lambda$ is diagonal (its diagonal entries $\lambda_i$ are the eigenvalues of $A$ and, unlike singular values, may be negative). Geometrically, this describes the action of $A$ as the composition of three operations:

  1. Perform the rotation $R^T$.
  2. Stretch each of the coordinate axes $e_i$ by a factor $\lambda_i$ (which is the $(i,i)$-entry of $\Lambda$).
  3. Finally, perform the rotation $R$ which is the inverse of the rotation $R^T$.

In other words, $A$ acts by rotating, stretching the standard basis vectors and then rotating back.
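As a small worked example (the matrix is chosen purely for illustration), take the symmetric matrix with eigenvalues $3$ and $1$ whose eigenvectors lie along the diagonals, so that $R$ is the rotation by $45^\circ$:

$$ A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} = \underbrace{\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}}_{R} \underbrace{\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}}_{\Lambda} \underbrace{\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}}_{R^T}. $$

Here $A$ rotates by $-45^\circ$, stretches the first coordinate axis by $3$ (leaving the second unchanged), and rotates back by $45^\circ$. Since the two outer factors are transposes of each other, transposing the product returns the same matrix, as it must for a symmetric $A$.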

When $A$ is not symmetric, we can't have such a description, but the decomposition $A = S \Lambda R^T$ gives us the next best thing: it describes the action of $A$ as the composition of three operations:

  1. First, perform the rotation $R^T$.
  2. Stretch each of the coordinate axes $e_i$ by a factor $\sigma_i$ (which is the $(i,i)$-entry of $\Lambda$).
  3. Finally, perform a different rotation $S$ which is not necessarily the inverse of $R^T$.

Unlike the symmetric case, here $R \neq S$ in general, so the action of $A$ is a rotation, followed by a stretch and then by another rotation. The action of $A^T = R\Lambda S^T$ is then obtained by reversing the roles of $R,S$ while keeping the same stretch factors. Namely, $A$ rotates by $R^T$, stretches by $\Lambda$ and rotates by $S$, while $A^T$ rotates by $S^T$, stretches by $\Lambda$ and rotates by $R$.
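The role swap can be checked numerically. The following is a minimal sketch in plain Python for $n = 2$; the rotation angles and stretch factors are arbitrary illustrative choices, and the helper functions (`rotation`, `transpose`, `matmul`, `close`) are hypothetical names introduced only for this example.

```python
import math

def rotation(theta):
    """2x2 counter-clockwise rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to floating-point tolerance."""
    return all(abs(x - y) < tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Illustrative choices (assumptions, not taken from the answer above)
S = rotation(math.radians(30))      # left rotation
R = rotation(math.radians(70))      # right rotation
sigma1, sigma2 = 3.0, 0.5           # stretch factors
Lam = [[sigma1, 0.0], [0.0, sigma2]]

A = matmul(matmul(S, Lam), transpose(R))             # A = S Lam R^T
At_from_svd = matmul(matmul(R, Lam), transpose(S))   # R Lam S^T

# The entrywise transpose of A agrees with R Lam S^T
transpose_matches = close(transpose(A), At_from_svd)

# First columns of R and S, written as 2x1 column vectors
r1 = [[R[0][0]], [R[1][0]]]
s1 = [[S[0][0]], [S[1][0]]]

# A sends r1 to sigma1 * s1, while A^T sends s1 back to sigma1 * r1
maps_r1_to_s1 = close(matmul(A, r1), [[sigma1 * v[0]] for v in s1])
maps_s1_to_r1 = close(matmul(transpose(A), s1), [[sigma1 * v[0]] for v in r1])

print(transpose_matches, maps_r1_to_s1, maps_s1_to_r1)
```

In this picture $A$ carries the right singular directions to the left ones and $A^T$ carries them back, with the same stretch factor in both directions.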


When trying to grasp the relation between $A$, $A^T$ and $A^{-1}$, I created the attached plot.
Writing the SVD as $A = U\Sigma V^T$, so that $A^T = V\Sigma U^T$, for $A^T$ this reads:

  • $\mathcal{r}_{U^T}$ is the rotation performed by $U^T$
  • $\mathcal{s}_{\Sigma}$ is the scaling performed by $\Sigma$
  • $\mathcal{r}_{V}$ is the rotation performed by $V$

The three axes show the SVD-decomposition of the three incarnations of $A$.

  • A green line between two axes indicates equality.
  • A red line indicates a contraposition.

In short, this says
"$A^T$ scales like $A$, but rotates like $A^{-1}$."
So, $A^T$ has more in common with $A^{-1}$ than it has in common with $A$.
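To make the slogan concrete: writing $A = U\Sigma V^T$, both $A^T = V\Sigma U^T$ and $A^{-1} = V\Sigma^{-1}U^T$ use the same outer rotations, and only the scaling in the middle differs. Below is a minimal sketch in plain Python for a $2 \times 2$ example; the angles and singular values are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math

def rotation(theta):
    """2x2 counter-clockwise rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to floating-point tolerance."""
    return all(abs(x - y) < tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Illustrative choices (assumptions, not taken from the plot)
U = rotation(math.radians(30))
V = rotation(math.radians(70))
Sigma = [[3.0, 0.0], [0.0, 0.5]]
Sigma_inv = [[1.0 / 3.0, 0.0], [0.0, 1.0 / 0.5]]   # invert each singular value

A = matmul(matmul(U, Sigma), transpose(V))            # A      = U Sigma      V^T
A_T = matmul(matmul(V, Sigma), transpose(U))          # A^T    = V Sigma      U^T
A_inv = matmul(matmul(V, Sigma_inv), transpose(U))    # A^{-1} = V Sigma^{-1} U^T

identity = [[1.0, 0.0], [0.0, 1.0]]

# A_T really is the transpose of A, and A_inv really is its inverse
is_transpose = close(A_T, transpose(A))
is_inverse = close(matmul(A_inv, A), identity) and close(matmul(A, A_inv), identity)

print(is_transpose, is_inverse)
```

The lines building `A_T` and `A_inv` differ only in the middle factor, which is exactly the statement that $A^T$ rotates like $A^{-1}$ but scales like $A$.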

Not all matrices have an inverse.
If the inverse does not exist, the plot can still be made, replacing $A^{-1}$ with $A^{\dagger}$ and $\Sigma^{-1}$ with $\Sigma^{\dagger}$.
$A^{\dagger}$ is the generalized inverse of $A$.
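For instance, assuming the generalized inverse meant here is the Moore-Penrose pseudoinverse and writing $A = U\Sigma V^T$, the pseudoinverse of a diagonal $\Sigma$ inverts the nonzero singular values and leaves the zero ones at zero. With illustrative values:

$$ \Sigma = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \Sigma^{\dagger} = \begin{pmatrix} \tfrac{1}{2} & 0 \\ 0 & 0 \end{pmatrix}, \qquad A^{\dagger} = V \Sigma^{\dagger} U^T. $$

So $A^{\dagger}$ still rotates like $A^T$; only the nonzero stretch factors are inverted.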

Some more detail can be found on: www.heavisidesdinner.com

[Figure: relation between the SVDs of $A$, $A^T$ and $A^{-1}$]