Basis independence in Quantum Mechanics

The state is a vector in the Hilbert space on which the Hamiltonian acts, and the Hamiltonian's eigenvectors provide a natural basis; eigenvectors belonging to distinct eigenvalues span distinct (orthogonal) subspaces - for degenerate eigenvalues the subspaces are larger, but they are still distinct from all the others. Clearly this situation gives many advantages in analysis.

However, this representation, though natural and dependent only upon the spectral decomposition of the Hamiltonian, is not unique. The experimentalist will prefer to establish a basis which corresponds to his experimental setup! Choosing a different basis will change all of the coordinates, but it does not change the states.

To make this clear, recall that the state vectors of our Hilbert space can also be viewed as rays, which are similar to the geometric rays of Euclidean geometry. Imagine the ray first, then superimpose a grid upon it - as you rotate the grid about the origin of the ray, the intersections of the grid with the ray define the coordinates for that basis. The coordinates change, but the ray doesn't change. For ordinary geometry the relationships between two rays (angles, projections) don't change, so it is clear that some relationships between the coordinates are fixed - these are properties of the inner product structure, not of any particular basis.

Our Hilbert space is a normed space - distances themselves carry no direct physical meaning, but a normalized state vector always has a length of 1, even when you change the basis; nor does the length change under unitary operators - hence their name.
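If you want to see this numerically, here is a minimal sketch of mine (not part of the argument above) using numpy: a random unitary is built from a QR decomposition, and the norm of a normalized vector stays 1 under it.

```python
import numpy as np

rng = np.random.default_rng(0)

# A normalized state vector in C^3 (any dimension works).
psi = rng.standard_normal(3) + 1j * rng.standard_normal(3)
psi = psi / np.linalg.norm(psi)

# A random unitary, obtained from the QR decomposition of a complex matrix.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))

print(np.linalg.norm(psi))      # 1.0
print(np.linalg.norm(Q @ psi))  # still 1.0: unitaries preserve the norm
```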

All of this becomes clear in a good linear algebra course.


$\newcommand{\real}{\mathbb R}\newcommand{\field}{\mathbb F}\newcommand{\cx}{\mathbb C}\newcommand{\ip}[2]{\left< #1,#2\right>}$We need to dive into the mathematics of vector spaces and inner products in order to understand what a vector is and what it means to take a scalar product of two vectors. There is a long post ahead, so bear with me even though you may think that the maths is too abstract and has nothing to do with QM. In fact, without understanding the abstract concept of vector spaces it is almost impossible to understand QM thoroughly.

Vector Spaces

Let's turn to linear algebra and think for a second about what we mean by a vector space $V$ over a field $(F,+ , \cdot , 0 ,1)$. It consists of a set $V$, a field $F$, a scalar multiplication $ \odot : F \times V \to V$, $(\lambda , v) \mapsto \lambda \odot v$, and a vector addition $ \oplus : V \times V \to V $, $ (v,w) \mapsto v \oplus w $, which have the following properties:

  • $(V, \oplus, \tilde 0)$ is an abelian group
  • $\forall v,w \in V$ and $\forall \lambda \in F: \; \; \lambda \odot (v \oplus w) = \lambda \odot v \oplus \lambda \odot w$
  • $ \forall v \in V$ and $\forall \lambda, \mu \in F: \; \; (\lambda + \mu) \odot v = \lambda \odot v \oplus \mu \odot v $
  • $ \forall v \in V$ and $\forall \lambda, \mu \in F: \; \; (\lambda \cdot \mu) \odot v = \lambda \odot (\mu \odot v) $
  • $\forall v \in V: \; \; 1 \odot v = v$

Note that the summation in the third axiom takes place in $F$ on the left hand side and in $V$ on the right hand side. So a vector space can really be anything with these properties. Normally people don't distinguish between $\odot$ and $\cdot$ or $\oplus$ and $+$ because they have similar properties. For the sake of being absolutely clear I'll nevertheless make this distinction. If it bothers you, replace $\oplus \to +$ and $\odot \to \cdot$ in your head. The elements of the set $V$ are called vectors, and I didn't mention a thing called a basis to define them.
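To make "a vector space can really be anything" concrete, here is a small numerical check of my own (the positive reals are just an assumed example): take $V = \real_{>0}$ over $F=\real$ with $v \oplus w := v\cdot w$ and $\lambda \odot v := v^\lambda$; the zero vector is the number $1$.

```python
import numpy as np

# V = positive reals over F = R, with
#   v (+) w   := v * w       (vector addition)
#   lam (.) v := v ** lam    (scalar multiplication)
# The "zero vector" is the number 1 and the additive inverse of v is 1/v.
vadd = lambda v, w: v * w
smul = lambda lam, v: v ** lam

rng = np.random.default_rng(1)
v, w = rng.uniform(0.1, 10, 2)
lam, mu = rng.standard_normal(2)

assert np.isclose(vadd(v, 1.0), v)                                          # 1 acts as the zero vector
assert np.isclose(vadd(v, 1 / v), 1.0)                                      # additive inverse
assert np.isclose(smul(lam, vadd(v, w)), vadd(smul(lam, v), smul(lam, w)))  # lam (.) (v (+) w)
assert np.isclose(smul(lam + mu, v), vadd(smul(lam, v), smul(mu, v)))       # (lam + mu) (.) v
assert np.isclose(smul(lam * mu, v), smul(lam, smul(mu, v)))                # (lam * mu) (.) v
assert np.isclose(smul(1.0, v), v)                                          # 1 (.) v = v
print("all axioms check out on these samples")
```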

Basis and Generators

A basis of a vector space is then defined to be a set $B \subseteq V$ which has the following two properties:

$$\forall B' \subseteq B\; \text{ with }\; \# B' < \infty: \; \; \bigoplus_{b \in B'} \lambda_b \odot b = \tilde 0 \implies \lambda_b = 0 \;\; \forall b \in B'$$

where $\lambda_b \in F$ and $$ \forall v \in V, \; \exists B' \subseteq B \; \text{ with }\; \# B' < \infty:\;\; v = \bigoplus_{b \in B'} \lambda_b \odot b$$

for some $\lambda_b \in F$.

A set $U \subseteq V$ with the first property is called a linearly independent set, and a set $T \subseteq V$ with the second property is called a generator of the vector space. Namely, you have

$$B \text{ is a basis} :\iff B \text{ is a generator of the vector space and linearly independent}$$

In a linear algebra class it is shown that all bases of $V$ have the same cardinality. We therefore define the dimension of $V$ to be $\mathrm{dim}(V)=\#B$ for $B$ a basis of $V$.

Representation of a vector as a tuple

If your vector space is finite dimensional, i.e. if $\mathrm{dim}(V)< \infty$, then you can simplify the above conditions as:

$$\bigoplus_{b\in B} \lambda_b \odot b = \tilde 0 \implies \lambda_b = 0 \;\; \forall b \in B \qquad \text{and} \qquad \forall v\in V: \; \; v= \bigoplus_{b\in B} \lambda_b \odot b$$

In a linear algebra course it is taught that every vector space (finite or infinite dimensional) $V$ has a basis, and that for a chosen basis the scalars $\lambda_b$ are unique. A basis is called an ordered basis if you arrange its elements in a tuple. If your vector space is finite dimensional and $B$ is an ordered basis, you can define a vector space isomorphism $\iota_B$

$$\iota_B : V \to F^n, \; v = \bigoplus_{b \in B} \lambda_b \odot b \mapsto (\lambda_{b_1}, \dots, \lambda_{b_n}) $$

where $b_i \in B$ $\forall i = 1 \dots n$ and $n = \#B$. There you can see the components of the vector $v$ as the numbers $\lambda_{b_i} \in F$. Note that even though the representation of a vector $v \in V $ as an $n$-tuple is unique for a given basis, there is no representation of the vector $v \in V$ that is the same for all bases. As an example I will consider the set $V:=\real^2$ as an $F:=\real$ vector space, where $\oplus$ and $\odot$ are defined in the following fashion. I'll denote the elements of $V$ with square brackets and their representations with round brackets to avoid confusion. $$[a,b] \oplus [c,d] = [a+c, b+d]$$ and $$\lambda \odot [a,b] = [\lambda \cdot a, \lambda \cdot b]$$ for all $a,b,c,d \in F$ and $[a,b],[c,d] \in V$. I leave it to you to check the vector space axioms. Note that whilst writing $[a,b] \in \real^2$ I'm not referring to any basis. It is merely the definition of being in $\real^2$ that lets me write $[a,b]$.

Now let $B=\{b_1,b_2\}$ with $b_1 = [1,0]$ and $b_2=[0,1]$. Note that $\lambda \odot b_1 \oplus \mu \odot b_2 = [0,0] \implies \mu= \lambda =0 $, so $B$ is linearly independent. Furthermore $[a,b] = a \odot b_1 \oplus b \odot b_2$, $\forall [a,b] \in V $, as you can easily check. Thus the isomorphism $\iota_B$ is well-defined and you get

$$\iota_B([a,b]) = (a,b) \;\; \forall [a,b] \in V$$

However, for another basis $C=\{c_1, c_2\}$ with $c_1 =[-1,0]$ and $c_2=[0,2]$ you get:

$$\iota_C([a,b])=(-a,b/2) \;\; \forall [a,b] \in V$$

as you can easily check.

Note that in this example your vectors are $[a,b]\in \real^2$, which exist as an element of $\real^2$ independent of any basis.
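A quick numerical cross-check of these two representations (a sketch of mine in numpy; the helper name `coordinates` is only for illustration): finding $\iota_B$ and $\iota_C$ amounts to solving a small linear system whose columns are the basis vectors.

```python
import numpy as np

# Coordinates of the vector [a, b] in R^2 with respect to two different bases.
# The coefficients lambda_b solve the linear system whose columns are the basis vectors.
def coordinates(basis, vector):
    M = np.column_stack(basis)          # columns are the basis vectors
    return np.linalg.solve(M, vector)   # the unique coefficients

a, b = 3.0, 4.0
v = np.array([a, b])

B = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
C = [np.array([-1.0, 0.0]), np.array([0.0, 2.0])]

print(coordinates(B, v))   # [ 3.  4.]  -> iota_B([a,b]) = (a, b)
print(coordinates(C, v))   # [-3.  2.]  -> iota_C([a,b]) = (-a, b/2)
```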

Another example of an $n$-dimensional vector space is the set of solutions of a homogeneous linear differential equation of order $n$. You should play with it: choose some basis and represent the solutions as tuples (see the sketch below). Note that in this case your vectors are functions, which solve that particular differential equation.
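As a sketch of that exercise (my own choice of equation and bases, not prescribed above), take $y'' + y = 0$: its solution space is two dimensional, $\{\cos, \sin\}$ is one basis, $\{\cos + \sin, \cos - \sin\}$ another, and the same solution gets different tuples.

```python
import numpy as np

# Solutions of y'' + y = 0 form a 2-dimensional real vector space.
basis1 = [np.cos, np.sin]
basis2 = [lambda x: np.cos(x) + np.sin(x), lambda x: np.cos(x) - np.sin(x)]

# A particular solution, defined without reference to any basis:
y = lambda x: 2 * np.cos(x) + 3 * np.sin(x)

# Sample the functions at two points to turn "find the coefficients" into a 2x2 linear system.
def coordinates(basis, f, pts=(0.3, 1.1)):
    M = np.array([[b(x) for b in basis] for x in pts])
    rhs = np.array([f(x) for x in pts])
    return np.linalg.solve(M, rhs)

print(coordinates(basis1, y))   # [2.  3.]
print(coordinates(basis2, y))   # [2.5 -0.5]  -- same function, different tuple
```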

In the case of an infinite dimensional vector space things get a little bit more difficult, but to understand the concept of a basis, finite dimensional vector spaces are the best way to go.

Inner/Scalar product

Let $V$ be a vector space, and let $\field \in \{\real, \cx\}$ be either the real or the complex numbers. Furthermore, I'll now replace $\oplus \to +$ and $\odot \to \cdot$ to make my point more clear. A scalar product is a function $\ip \cdot \cdot: V \times V \to \field$, $(v,w) \mapsto \ip vw$, with the following properties:

  • $\forall v_1,v_2,w \in V, \; \forall \lambda \in \field: \;\; \ip{v_1 + \lambda v_2} w = \ip{v_1} w + \lambda^* \ip{v_2}w $
  • $\forall v,w_1,w_2 \in V, \; \forall \lambda \in \field: \;\; \ip{v}{ w_1 + \lambda w_2} = \ip{v}{ w_1} +\lambda \ip{v}{w_2}$
  • $\forall v,w \in V: \;\;\ip vw = \ip wv^*$
  • $\forall v \in V\setminus\{0\}:\;\; \ip vv \in \real_{>0}$

Again, note that $\ip \cdot\cdot$ could be any function with these properties. Furthermore, I didn't need a basis of $V$ to define the scalar product, so it cannot possibly depend on the basis chosen. A vector space with a scalar product is called an inner product space.

As an example take the polynomials of degree $\leq n$, defined on the interval $I=[0,1]\subset\real$. You can easily show that $$V:=\{P:I \to \real \, | \, P \text{ is a polynomial and the degree of } P \leq n\}$$ is a vector space. Furthermore we define

$$\forall P,Q \in V\;\; \ip PQ_1 := \int_0^1 P(x) \cdot Q(x) \, \mathrm d x $$

Note that this function is a valid scalar product on $V$. However, I can also define $$\forall P= p_n x^n + \dots + p_0 , \,Q = q_n x^n + \dots + q_0 \in V: \;\; \ip PQ_2 := p_nq_n + \dots + p_0 q_0 $$

which is also a perfectly good definition of a scalar product. Again, I'm not referring to any basis as I write $P= p_n x^n + \dots + p_0$. It is just the definition of being in $V$. It is clear that there is no unique scalar product defined on a particular vector space.
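Here is a minimal numerical comparison of the two scalar products (my own sketch, using numpy's `Polynomial` class with $n=2$): the same pair $P, Q$ gives different numbers under $\ip\cdot\cdot_1$ and $\ip\cdot\cdot_2$.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Two different scalar products on polynomials of degree <= n (here n = 2).
def ip1(P, Q):
    antideriv = (P * Q).integ()          # antiderivative of the product P*Q
    return antideriv(1.0) - antideriv(0.0)

def ip2(P, Q, n=2):
    p = np.pad(P.coef, (0, n + 1 - len(P.coef)))
    q = np.pad(Q.coef, (0, n + 1 - len(Q.coef)))
    return np.dot(p, q)

P = Polynomial([1.0, 2.0, 0.0])   # 1 + 2x
Q = Polynomial([0.0, 0.0, 3.0])   # 3x^2

print(ip1(P, Q))   # integral_0^1 (1+2x)*3x^2 dx = 1 + 3/2 = 2.5
print(ip2(P, Q))   # 1*0 + 2*0 + 0*3 = 0.0  -- a different, equally valid scalar product
```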

Representation of $\ip \cdot \cdot$ wrt a Basis

Now if I choose an ordered basis $B$ of $V$ then I can simplify my life a little bit. Let $v\in V$ with $v= \sum_{i=1}^n v_i b_i$ and $w\in V$ with $w = \sum_{j=1} ^n w_j b_j$. Then I can write:

$$\ip vw = \ip{\sum_{i=1}^n v_i b_i}{\sum_{j=1} ^n w_j b_j} = \sum_{i=1}^n\sum_{j=1}^n v_i^* w_j \ip {b_i} {b_j}$$

You see now that this looks like the matrix product $\iota_B(v)^\dagger \cdot A \cdot\iota_B(w) $, where $A=(a_{ij})$ with $a_{ij}:=\ip{b_i}{b_j}$. Note that this representation of $\ip\cdot \cdot$ depends on the basis chosen. Having chosen a basis, however, you can just do the matrix multiplication to get the scalar product.

For the above example with polynomials and the inner product $\ip \cdot \cdot _2$, if you choose the basis to be $b_i = x^i$ then you get $A = \mathrm{diag}(1, \dots, 1)$, the identity matrix.
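For $\ip\cdot\cdot_1$ in the same monomial basis the Gram matrix is not the identity: $a_{ij} = \int_0^1 x^i x^j \,\mathrm dx = 1/(i+j+1)$, the Hilbert matrix. A small numpy sketch of mine checking that $\iota_B(v)^\dagger A\, \iota_B(w)$ reproduces the integral:

```python
import numpy as np
from numpy.polynomial import Polynomial

n = 2
# Gram matrix of <.,.>_1 in the ordered basis (1, x, ..., x^n):
# a_ij = integral_0^1 x^i x^j dx = 1 / (i + j + 1)  (the Hilbert matrix).
A = np.array([[1.0 / (i + j + 1) for j in range(n + 1)] for i in range(n + 1)])

# Coordinate tuples of two polynomials in that basis.
v = np.array([1.0, 2.0, 0.0])   # 1 + 2x
w = np.array([0.0, 0.0, 3.0])   # 3x^2

# Matrix form of the scalar product vs. direct integration:
matrix_form = v @ A @ w
antideriv = (Polynomial(v) * Polynomial(w)).integ()
direct = antideriv(1.0) - antideriv(0.0)
print(matrix_form, direct)   # 2.5 2.5

# For <.,.>_2 the same basis gives A = identity, so the matrix form is just the dot product.
```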

Hilbert Space

A Hilbert space is an inner product space which, as a metric space, is complete with respect to the metric induced by the scalar product. This means your norm is defined by $\lVert v \rVert := \sqrt{\ip vv}$ and the metric is defined by $d(v,w):= \lVert v-w \rVert$. For what it means for a space to be complete you can check the Wikipedia article.

Upshot

There is a very clear distinction between the vector itself, as an element of the set $V$, and its representation as a tuple in $F^n$ with respect to a basis $B$ of $V$. The vector itself exists regardless of any basis, whereas the representation of the vector is basis dependent. The same goes for the scalar product and its representation.

In physics one automatically assumes the standard basis for $\real^n$, whose vectors have zeros in every component except for a single one equal to one, and calculates everything with the representations of vectors without specifying any basis, which in turn creates the illusion that a vector is its components and that a vector without components is unimaginable.

Since a vector space could be almost anything, it is really hard to imagine what a vector is without referring to its components. The best way to do that, in my opinion, is to accept the fact that a vector is just an element of the set $V$ and nothing more. For example if you write

$$\left|\psi \right> = c_1 \left|+ \right> + c_2 \left| - \right>$$

you merely say that $\left| \psi \right>$ can be written as a linear combination of $\left| \pm \right>$. So this does not refer to any basis. However, the moment you write

$$\left|\psi \right> = c_1 \left|+ \right> + c_2 \left| - \right> = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$$

first of all a mathematician dies, because $\left|\psi \right> = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ doesn't make any sense: $\left|\psi \right> \in \mathcal H$ whereas $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \in \cx^2$. Regardless of that, $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ refers to an ordered basis of $\mathcal H$, namely the basis $B=\{\left|+ \right>, \left|- \right> \}$, and this representation depends on the basis that you've chosen.
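As a numerical illustration of my own (the convention $\left|\pm\right> = (\left|0\right> \pm \left|1\right>)/\sqrt 2$ is an assumption of this sketch), the same ket gets different column representations with respect to the ordered bases $\{\left|0\right>,\left|1\right>\}$ and $\{\left|+\right>,\left|-\right>\}$:

```python
import numpy as np

# Columns of the standard basis {|0>, |1>} of C^2 and of the basis {|+>, |->},
# with the usual convention |+-> = (|0> +- |1>)/sqrt(2) (an assumption of this sketch).
ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
ketp, ketm = (ket0 + ket1) / np.sqrt(2), (ket0 - ket1) / np.sqrt(2)

c1, c2 = 0.6, 0.8j
psi = c1 * ketp + c2 * ketm        # the ket itself, here stored via the standard basis

# Its representation w.r.t. an ordered orthonormal basis (b1, b2) is the
# column of overlaps (<b1|psi>, <b2|psi>).
rep = lambda basis, ket: np.array([np.vdot(b, ket) for b in basis])

print(rep([ket0, ket1], psi))   # [(0.6+0.8j)/sqrt(2), (0.6-0.8j)/sqrt(2)]
print(rep([ketp, ketm], psi))   # [0.6+0.j  0.+0.8j]  -> (c1, c2)
```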


Suppose $\left\{\left|e_i\right\rangle|i\in I\right\}$ is an orthonormal basis of a Hilbert space $\mathcal{H}$, viz. $\left\langle e_i |e_j\right\rangle =\delta_{ij}$. Then the identity operator from $\mathcal{H}$ to $\mathcal{H}$ can be written as a sum of outer products $$\mathbb{I}=\Sigma_{i\in I}\left|e_i\right\rangle\left\langle e_i\right|\quad\left(\ast\right),$$because kets then satisfy $\left|\psi\right\rangle=\Sigma_{i\in I}\left|e_i\right\rangle\left\langle e_i|\psi\right\rangle$, and similarly with bras.

If $\left\{\left|f_j\right\rangle|j\in I\right\}$ is another orthonormal basis, we may write $\left|e_i\right\rangle = \Sigma_{j\in I} U_{ij}\left|f_j\right\rangle$ with $U_{ij}$ a unitary matrix; unitarity is what makes the two expansions consistent, since substituting each into the other returns $\left|e_i\right\rangle = \Sigma_{k\in I}\delta_{ik}\left|e_k\right\rangle$ and $\left|f_j\right\rangle = \Sigma_{l\in I}\delta_{jl}\left|f_l\right\rangle$. Hence $\left\langle e_i\right| = \Sigma_{j\in I}U_{ij}^\ast\left\langle f_j\right|=\Sigma_{j\in I}\left(U^\dagger\right)_{ji}\left\langle f_j\right|$. Finally, $$\Sigma_{i\in I}\left|e_i\right\rangle\left\langle e_i\right|=\Sigma_{i,\,j,\,k\in I}\left(U^\dagger\right)_{ki}U_{ij}\left|f_j\right\rangle\left\langle f_k\right|=\Sigma_{j,\,k\in I}\delta_{kj}\left|f_j\right\rangle\left\langle f_k\right|=\Sigma_{j\in I}\left|f_j\right\rangle\left\langle f_j\right|,$$ so our two "identity operators" match. Thus either can be used to expand a ket as per Eq. $\left(\ast\right)$; the basis choice doesn't matter.
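A quick numpy check of this last identity (my own sketch; a random unitary built via QR plays the role of $U_{ij}$):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4

# Two orthonormal bases of C^d: the standard one and one related to it by a random unitary.
E = np.eye(d, dtype=complex)                                   # columns e_i
U, _ = np.linalg.qr(rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d)))
F = E @ U                                                      # columns f_j, still orthonormal

# Sum of outer products |b_i><b_i| for each basis:
id_E = sum(np.outer(E[:, i], E[:, i].conj()) for i in range(d))
id_F = sum(np.outer(F[:, j], F[:, j].conj()) for j in range(d))

print(np.allclose(id_E, np.eye(d)), np.allclose(id_F, np.eye(d)))   # True True
```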