PCA and image compression
When computing the PCA of a data matrix $A$, the eigenvectors are orthogonal because the matrix we actually diagonalize, $A^TA$ (the covariance matrix up to scaling, when the columns of $A$ are centered), is symmetric. In general, eigenvectors of a matrix are not necessarily orthogonal; orthogonality is a property that holds for symmetric matrices.
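To make the contrast concrete, here is a minimal NumPy sketch. The matrix `B` is an arbitrary example chosen for illustration, not something from the question; it compares the eigenvectors of a non-symmetric matrix with those of its symmetrized product `B.T @ B`.

```python
import numpy as np

# Arbitrary non-symmetric example matrix (an assumption for illustration)
B = np.array([[2.0, 1.0],
              [0.0, 3.0]])
S = B.T @ B                      # symmetric by construction: S equals S.T

# eigh is meant for symmetric matrices and returns orthonormal eigenvectors
_, Q = np.linalg.eigh(S)
sym_gram = Q.T @ Q               # equals the identity if the columns are orthonormal

# eig handles general matrices; its eigenvectors need not be orthogonal
_, P = np.linalg.eig(B)
gen_gram = P.conj().T @ P        # off-diagonal entries are nonzero here

sym_is_orthogonal = np.allclose(sym_gram, np.eye(2))
gen_is_orthogonal = np.allclose(gen_gram, np.eye(2))
print(sym_is_orthogonal, gen_is_orthogonal)
```

For this `B` the eigenvectors are $(1, 0)$ and $(1, 1)/\sqrt{2}$, which are clearly not perpendicular, while the eigenvectors of `S` are.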
The singular value decomposition of a matrix $A$ can be written as
$$A = U \Sigma V^T$$
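where $U$ and $V$ have orthonormal columns and $\Sigma$ is diagonal with the nonnegative singular values on its diagonal.
To see the connection with eigenvectors, we can look at $$A^TA = V\Sigma^T U^T U \Sigma V^T = V\Sigma^2 V^T,$$ where we used $U^TU = I$ and wrote $\Sigma^2$ for $\Sigma^T\Sigma$. (Numerically, it is usually better to compute the SVD of $A$ directly, because forming $A^TA$ squares the condition number.)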
Equivalently, if we look at $A^TA$ as a symmetric, diagonalizable matrix, we can compute its eigendecomposition as $A^TA = W\Lambda W^T$ and we know that the eigenvectors found in the columns of $W$ are orthogonal due to symmetry.
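The link between SVD and PCA is recognizing that these two decompositions of $A^TA$ are the same: we can take $W = V$ and $\Lambda = \Sigma^2$. So the principal directions are the right singular vectors of $A$, and the eigenvalues of $A^TA$ are the squared singular values.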
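This correspondence can be checked numerically. The sketch below uses a random matrix `A` as a stand-in for real data (an assumption for illustration) and compares `np.linalg.svd(A)` with `np.linalg.eigh(A.T @ A)`. Eigenvectors are only defined up to sign, so the comparison uses absolute values of the dot products.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))      # hypothetical data: 6 samples, 3 features

# Thin SVD: singular values come back in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Eigendecomposition of the symmetric matrix A^T A: eigenvalues ascend
lam, W = np.linalg.eigh(A.T @ A)

# Reverse the eigenpairs so both decompositions use descending order
lam, W = lam[::-1], W[:, ::-1]

# Lambda should equal Sigma squared
eigenvalues_match = np.allclose(lam, s**2)

# Column i of W should equal column i of V up to sign, so |w_i . v_i| = 1
vectors_match = np.allclose(np.abs(np.sum(W * Vt.T, axis=0)), 1.0)
print(eigenvalues_match, vectors_match)
```

The sign comparison relies on the singular values being distinct, which holds with probability one for a random matrix like this one.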
You may have more luck finding mathematical facts if you search for the decomposition that PCA is built on, the singular value decomposition (SVD). Still another name for essentially the same mathematical idea is the Karhunen-Loève transform.
The properties of PCA that you asked about result from the properties of the spectral decomposition of real symmetric matrices. Such matrices always have real eigenvalues; eigenspaces belonging to different eigenvalues are orthogonal; and inside each eigenspace an orthogonal basis of eigenvectors can be found. The transformation matrix can therefore always be constructed to be orthogonal (orthonormal columns).
The first components should show features common to all the images. If the data is not centered first, the leading component typically looks much like the average over all images.
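As a rough illustration of both points (the leading component and compression), here is a sketch on synthetic data. The "images" are made up for the example: 50 flattened 8x8 arrays built from one shared pattern plus small noise. The helper `compress` is a hypothetical name, not a library function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for an image set: 50 flattened 8x8 "images",
# each a shared pattern plus small noise (an assumption for illustration)
shared = np.outer(np.linspace(0, 1, 8), np.linspace(1, 0, 8)).ravel()
X = shared + 0.05 * rng.standard_normal((50, 64))

# PCA without centering: the rows of Vt are the components
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Cosine of the angle between the first component and the mean image
# (absolute value, because the sign of a component is arbitrary)
mean_img = X.mean(axis=0)
cosine = abs(Vt[0] @ mean_img) / np.linalg.norm(mean_img)

def compress(k):
    """Rank-k approximation of X that keeps only the k leading components."""
    return (U[:, :k] * s[:k]) @ Vt[:k]

# Frobenius-norm reconstruction error for increasing k
errors = [np.linalg.norm(X - compress(k)) for k in (1, 5, 20)]
print(cosine, errors)
```

Storing the rank-`k` version needs roughly `k * (50 + 64 + 1)` numbers instead of `50 * 64`, which is where the compression comes from. The error shrinks as `k` grows, and on data like this the first component is almost parallel to the mean image.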