A simple explanation of eigenvectors and eigenvalues with 'big picture' ideas of why on earth they matter

To understand why you encounter eigenvalues/eigenvectors everywhere, you must first understand why you encounter matrices and vectors everywhere.

In a vast number of situations, the objects you study and the stuff you can do with them relate to vectors and linear transformations, which are represented as matrices.

So, in many, many interesting situations, important relations are expressed as $$\vec{y} = M \vec{x}$$ where $\vec{y}$ and $\vec{x}$ are vectors and $M$ is a matrix. This ranges from systems of linear equations you have to solve (which occur virtually everywhere in science and engineering) to more sophisticated engineering problems (finite element simulations). It is also the foundation for (a lot of) quantum mechanics, and it describes the typical geometric transformations you can do with vector graphics and 3D graphics in computer games.

Now, it is generally not straightforward to look at some matrix $M$ and immediately tell what it is going to do when you multiply it with some vector $\vec{x}$. Also, in the study of iterative algorithms, you need to know something about higher powers of the matrix $M$, i.e. $M^k = M \cdot M \cdots M$ ($k$ times), which is awkward and costly to compute in a naive fashion.

For a lot of matrices, you can find special vectors with a very simple relationship between the vector $\vec{x}$ itself, and the vector $\vec{y} = Mx$. For example, if you look at the matrix $\left( \begin{array}{cc} 0 & 1 \\ 1 & 0\end{array}\right)$, you see that the vector $\left(\begin{array}{c} 1\\ 1\end{array}\right)$ when multiplied with the matrix will just give you that vector again!

For such a vector, it is very easy to see what $M\vec{x}$ looks like, and even what $M^k \vec{x}$ looks like, since, obviously, repeated application won't change it.
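A minimal NumPy sketch of that observation (assuming you have NumPy available), using the swap matrix and the vector $(1, 1)$ from above:

```python
import numpy as np

M = np.array([[0, 1],
              [1, 0]])   # the swap matrix from the example above
x = np.array([1, 1])

print(M @ x)                              # [1 1] -- the same vector comes back out
print(np.linalg.matrix_power(M, 5) @ x)   # [1 1] -- still unchanged after five applications
```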

This observation is generalized by the concept of eigenvectors. An eigenvector of a matrix $M$ is any vector $\vec{x}$ that only gets scaled (i.e. just multiplied by a number) when multiplied with $M$. Formally, $$M\vec{x} = \lambda \vec{x}$$ for some number $\lambda$ (real or complex depending on the matrices you are looking at).
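If you want to see these special vectors computed rather than spotted by eye, NumPy's np.linalg.eig returns exactly these pairs; a small sketch for the swap matrix above:

```python
import numpy as np

M = np.array([[0, 1],
              [1, 0]])

eigenvalues, eigenvectors = np.linalg.eig(M)
print(eigenvalues)   # 1 and -1 (order may vary)
print(eigenvectors)  # columns are normalized eigenvectors, e.g. (1, 1)/sqrt(2) for eigenvalue 1
```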

So, if your matrix $M$ describes a system of some sort, the eigenvectors are those vectors that, when they go through the system, are changed in a very easy way. If $M$, for example, describes geometric operations, then $M$ could, in principle, stretch and rotate your vectors. But eigenvectors only get stretched, not rotated.

The next important concept is that of an eigenbasis. By choosing a different basis for your vector space, you can alter the appearance of the matrix $M$ in that basis. Simply speaking, the $i$-th column of $M$ tells you what the $i$-th basis vector, multiplied with $M$, would look like. If all your basis vectors are also eigenvectors, then it is not hard to see that the matrix $M$ is diagonal. Diagonal matrices are a welcome sight, because they are really easy to deal with: matrix-vector and matrix-matrix multiplication become very efficient, and computing the $k$-th power of a diagonal matrix is also trivial.
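Here is a sketch of why that pays off, assuming $M$ is diagonalizable: collect the eigenvectors as columns of a matrix $P$ and the eigenvalues on the diagonal of $D$, so that $M = P D P^{-1}$ and therefore $M^k = P D^k P^{-1}$, where $D^k$ only requires raising the diagonal entries to the $k$-th power. The example matrix below is my own choice, purely for illustration:

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # an illustrative diagonalizable matrix

eigenvalues, P = np.linalg.eig(M)   # columns of P are eigenvectors of M
D = np.diag(eigenvalues)            # M in its eigenbasis: a diagonal matrix

k = 10
M_k_eigen = P @ D**k @ np.linalg.inv(P)   # M^k via the eigenbasis (just power the diagonal)
M_k_naive = np.linalg.matrix_power(M, k)  # repeated multiplication

print(np.allclose(M_k_eigen, M_k_naive))  # True
```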

I think for a "broad" introduction this might suffice?


This made it clearer for me: Khan Academy - Introduction to Eigenvalues and Eigenvectors. I often find it easier to understand via an illustration like this.


A layman's explanation of eigenvectors and eigenvalues.

Find the nearest pen or pencil. Roll the pen between your palms so that when it spins, the axis of rotation matches the direction the pen points. Now assume we have a 3D simulation that rotates the pen in this way. In the simulation of the rotated pen, the computer has to calculate the position of each point within the pen. The rotation is performed by a 3D transformation matrix that, when multiplied with the matrix of the points in the pen, defines precisely how the pen rotates in 3D Cartesian space. The pen is just a matrix of 3D points. There is another matrix that, when multiplied with those points, yields the correct rotation around the axis of rotation.

In this little pen-rolling simulation, you have the matrix for the locations of the particles in the pen, and you have the matrix that says exactly how to do the 3D transform to make it rotate. What if you wanted to know the axis of rotation of the pen, given only the pen and the transform? At first glance, you would not know how to do it.

Enter stage left the following equation: np.dot(MATRIX, vector) == multiple * vector, which would translate to this in our little story:

np.dot(TRANSFORM, AXIS_OF_ROTATION) == SOME_NUMBER * AXIS_OF_ROTATION

This equation asserts that if you can find a vector that, when the transform is applied to it, comes out as just a number times itself, then that vector is the axis of rotation. You can find out which way the pen is pointing given only how the particles in the pen spin. It's a clever trick to isolate variables and discover new truths.
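Here is a minimal NumPy sketch of that idea; the rotation matrix below (a 30-degree rotation about the z-axis) is just an illustrative choice of mine. A rotation leaves its axis completely unchanged, so the axis is the eigenvector whose eigenvalue is 1:

```python
import numpy as np

theta = np.deg2rad(30)
# Illustrative 3D transform: rotate by 30 degrees about the z-axis
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

eigenvalues, eigenvectors = np.linalg.eig(R)

# The axis of rotation is the eigenvector whose eigenvalue is (closest to) 1
axis_index = np.argmin(np.abs(eigenvalues - 1.0))
axis = np.real(eigenvectors[:, axis_index])
print(axis)  # [0. 0. 1.], i.e. the z-axis (up to sign)
```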

OK, but why?

This methodology is useful because finding the vectors that a matrix merely scales (a matrix times a vector producing the same result as a scalar times that vector) is exactly how we find the directions along which data varies the most, and the directions along which it barely varies at all. Enter stage right the principal component analysis algorithm: an algorithm that reduces dimensionality while minimizing the loss of information.

The method you use to discover eigenvectors is the same machinery that principal component analysis uses to find the new basis vectors for points scattered on the coordinate plane. Imagine we have two input features, tire age and tire wear. We plot these on a scatterplot and they form a straight diagonal line.

Eigenvectors and eigenvalues play a role when we want to compress tire age and tire wear, since they are so strongly correlated. Imagine a 2D scatterplot with points on a straight diagonal line. This 2-dimensional straight line can be compressed into one dimension without much data loss. So find the eigenvectors of the covariance matrix of the points. The one with the largest eigenvalue points along the line: it is the direction of maximum variance, like the pen's axis of rotation from before (imagine taking a pencil and rolling it between your palms: it spins along its axis of rotation). You can rebase the points around that vector, and you've compressed 2 dimensions into one. We're happy because we've reduced the data size while giving up almost none of the information (variance).

Reset the Cartesian plane to put the x-axis along that eigenvector, and you have a recipe for compressing a 2-dimensional object into a 1-dimensional one while preserving nearly all of the information contained in the 2D object.
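A sketch of that recipe in NumPy, with made-up tire numbers (the data below is invented purely for illustration): build the covariance matrix of the two features, take its eigenvector with the largest eigenvalue, and project the points onto it.

```python
import numpy as np

# Invented data: tire age (years) and tire wear (mm of tread lost), almost perfectly correlated
age = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
wear = 2.0 * age + np.array([0.1, -0.05, 0.0, 0.05, -0.1, 0.0, 0.05, -0.05])
X = np.column_stack([age, wear])

X_centered = X - X.mean(axis=0)          # PCA works on mean-centered data
cov = np.cov(X_centered, rowvar=False)   # 2x2 covariance matrix of the two features

eigenvalues, eigenvectors = np.linalg.eig(cov)
principal = eigenvectors[:, np.argmax(eigenvalues)]  # direction of maximum variance

compressed = X_centered @ principal      # each 2D point becomes a single coordinate
print(compressed)
```

Multiplying each of those single coordinates back onto the principal direction reconstructs the original (centered) points up to the tiny deviations from the line, which is the sense in which almost no information is lost.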

Now that you have this algorithm and procedure in mind, you can check out 3Blue1Brown's explanation and hopefully the beauty of this tool will materialize: https://www.youtube.com/watch?v=PFDu9oVAE-g If it does click, you'll be able to explain how finding eigenvectors is a great tool for addressing the curse of dimensionality as it applies to neural networks and other supervised learning algorithms.

In even simpler words, if you're still confused:

Surely you have thrown and caught a ball in midair before. Have you ever wondered how your brain is able to send instructions to the muscles in your shoulder, biceps and forearm to throw and catch a ball in midair, even though you are unable to describe the multivariable calculus of what must be happening to perform these miracles? Eigenvectors and eigenvalues are, in effect, the structures your brain uses to correctly assess the incoming trajectory of the ball, given only 2D frames over time. Your mind is able to untangle 2 dimensions into 3 dimensions correctly. This machinery is hundreds of millions of years old and is present even in rodents and insects, so these principles are ancient. Your brain furiously processes something like the equation I described above: when a vector, transformed by a matrix, equals a scalar multiple of itself, that satisfied equivalence singles out the special direction, here the incoming trajectory. Over those eons, the nanotechnology in your head has stumbled across and harnessed the powerful concept of eigenvectors and eigenvalues. It's used all over, even during the process of learning and model compression during sleep.

These eigen-concepts are at the core of why many machine learning algorithms work. Tesla's self-driving car team was able to harness the mathematics of these eigen-transforms and apply it to the mathematics of optics and camera calibration, to transform 2D visual images into new ground truths for the position, distance, orientation, direction, velocity, and acceleration of vehicles and objects in the vicinity. These eigen-structures are the tools of computer vision developers doing truth extraction from live photographs, which from the computer's perspective are just 2D matrices of numbers representing color.