Paradox about the Stern-Gerlach experiment with $B=0$, $\nabla B\ne 0$

Steve's intuition for the measurement operator is in the right direction, but it's not quite the right interaction potential. Getting that right is simple, but it requires the complex task of taking the naive answer to its ultimate consequences and then understanding what it's really saying. The hamiltonian is the same as always, a direct coupling between the spin and the magnetic field: $$ \hat H_\mathrm{int}=\mu \mathbf B(\hat{\mathbf r})\cdot \hat{\boldsymbol\sigma}. $$ However, we know more about the magnetic field in this situation - namely, that it varies linearly with position - and we need to bring that to bear. Thus, we know that \begin{align} \mathbf B (\mathbf r) & = \mathbf B (\mathbf 0) +\mathbf r\cdot\nabla\mathbf B(\mathbf 0) +\mathcal O(r^2) \\ & \approx x\frac{\partial B_x}{\partial x}(\mathbf 0)\,\hat{\mathbf e}_x+y\frac{\partial B_y}{\partial y}(\mathbf 0)\,\hat{\mathbf e}_y \\ & =k\left(-x\hat{\mathbf e}_x+y\hat{\mathbf e}_y\right), \end{align} to linear order, since you've specified everything else to be zero. Here $k=\frac{\partial B_y}{\partial y}(\mathbf 0)=-\frac{\partial B_x}{\partial x}(\mathbf 0)$, with the two gradients tied together by $\nabla\cdot\mathbf B=0$. If we put that back into our hamiltonian, then, we get $$ \hat H_\mathrm{int}= k\mu \left( -\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y \right). $$ It's important to note here that we have not yet specified the spin of the system, so we do not know what representation of $\mathrm{SU}(2)$ the spin matrices are in - so they could be ten-by-ten matrices for all we know. Nevertheless, if we assume a spin-1/2 representation, one way to write down that hamiltonian is in the $\hat{\sigma}_z$ basis, where it reads $$ \hat H_\mathrm{int} = k\mu \begin{pmatrix} 0 & -\hat{x}-i\hat{y} \\ -\hat{x} +i\hat{y} & 0 \end{pmatrix}. $$ However, you can also choose to work in the $\hat{\sigma}_x$ basis, say, in which case (with a suitable choice of phases for the basis vectors) it reads $$ \hat H_\mathrm{int} = k\mu \begin{pmatrix} -\hat{x}& \phantom{+}\hat{y} \\ \phantom{+}\hat{y}& +\hat{x} \end{pmatrix}, $$ with a complete equivalence between the two.
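The equivalence of the two matrix forms can be checked numerically. The following is a minimal sketch, treating $x$ and $y$ as ordinary numbers (as they are when acting on a position eigenstate) and dropping the overall factor $k\mu$. The change-of-basis matrix `V`, its phase convention, and the function names are assumptions made for this illustration, not something taken from the answer above.

```python
import math

# Pauli matrices in the sigma_z basis, stored as nested lists of complex numbers.
SIGMA_X = [[0j, 1 + 0j], [1 + 0j, 0j]]
SIGMA_Y = [[0j, -1j], [1j, 0j]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def coupling_z_basis(x, y):
    """-x*sigma_x + y*sigma_y for c-number x, y, in the sigma_z basis."""
    return [[-x * SIGMA_X[i][j] + y * SIGMA_Y[i][j] for j in range(2)]
            for i in range(2)]

# Columns of V are sigma_x eigenvectors. The -i phase on the second column is
# a convention chosen here so that sigma_y becomes real and symmetric.
R = 1 / math.sqrt(2)
V = [[R + 0j, -1j * R], [R + 0j, 1j * R]]

def coupling_x_basis(x, y):
    """The same operator re-expressed in the sigma_x eigenbasis."""
    return matmul(dagger(V), matmul(coupling_z_basis(x, y), V))

x, y = 0.3, -1.1  # arbitrary sample values
for row in coupling_x_basis(x, y):
    print([complex(round(v.real, 12), round(v.imag, 12)) for v in row])
```

With this choice of `V` the result has $-x$ and $+x$ on the diagonal and $y$ off the diagonal; a different phase convention for the eigenvectors would change the off-diagonal entries by a phase.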


To get some intuition for why this is right, it's helpful to step back a bit and pretend that we only had one of the two operators acting on our system, so the interaction hamiltonian reads $$ \hat H_\mathrm{int}= k\mu\hat{x}\hat{\sigma}_x, $$ as we like to use in the standard case. What does this operator do? Well, assuming that the measurement is impulsive, what happens is essentially that whatever our initial state $|\psi(0)\rangle$ was, after the measurement time $\tau$ it gets transferred to $$ |\psi(\tau)\rangle = e^{-i\tau\hat{H}_\mathrm{int}/\hbar} |\psi(0)\rangle = e^{-i\tau k\mu\hat{x}\hat{\sigma}_x/\hbar} |\psi(0)\rangle. $$ That's a lot of operators inside an exponential, but it's actually rather simple to analyze if we assume that our initial state was in a $\hat{\sigma}_x$ eigenstate and it had definite momentum $p_x$, in which case the measurement acts as $$ |\psi(\tau)\rangle = e^{-i\tau k\mu\hat{x}\hat{\sigma}_x/\hbar} |\psi(0)\rangle = e^{-i\tau k\mu\hat{x}\sigma_x/\hbar} |p_x,\sigma_x\rangle = \left|p_x - k\mu\tau \, \sigma_x,\sigma_x\right\rangle, $$ i.e. the $\hat{\sigma}_x$ in the exponent becomes a number, and $e^{-i\tau k\mu\hat{x}\sigma_x/\hbar}$ is just a displacement operator on momentum space (it multiplies the plane wave $e^{ip_xx/\hbar}$ by $e^{-ik\mu\tau\sigma_x x/\hbar}$). Thus, over the course of the measurement, the apparatus has exerted an impulse $-k\mu \tau\,\sigma_x$ on the particle, whose sign depends on the spin, and off it shoots in that direction.
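The displacement step can be illustrated numerically. This is a minimal sketch in units with $\hbar=1$, where the parameter `a` stands in for $k\mu\tau\sigma_x$; the function names and the sample values are arbitrary choices for this example. It checks that multiplying the plane wave $e^{ipx/\hbar}$ by the phase $e^{-iax/\hbar}$ gives the plane wave with momentum $p-a$.

```python
import cmath

HBAR = 1.0  # units with hbar = 1, an assumption made for this sketch

def plane_wave(p, x):
    """Position-space amplitude <x|p> (unnormalized)."""
    return cmath.exp(1j * p * x / HBAR)

def kicked_wave(p, a, x):
    """Amplitude after applying exp(-i*a*x/hbar) to the plane wave |p>."""
    return cmath.exp(-1j * a * x / HBAR) * plane_wave(p, x)

def max_mismatch(p, a, samples):
    """Largest deviation between the kicked wave and the plane wave |p - a>."""
    return max(abs(kicked_wave(p, a, x) - plane_wave(p - a, x)) for x in samples)

samples = [0.1 * n for n in range(-50, 51)]
for sigma in (+1, -1):
    a = 0.7 * sigma  # stands in for k*mu*tau*sigma_x; 0.7 is an arbitrary value
    print(sigma, max_mismatch(2.0, a, samples))
```

The two spin eigenvalues give opposite values of `a`, and therefore opposite momentum kicks, which is the whole content of the standard Stern-Gerlach splitting.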


That's the minimal example, though, so now we need to look at our full hamiltonian with the tangle of $-\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y$ in it. This is a bit more complicated, because there is now no way to initialize a spin state that will reduce the spatial dynamics to an effectively-spinless interaction, as in the previous case.

So, let's have a look at what happens in the general case, where the interaction with the magnet is now implemented by the unitary $$ \hat{U}(\tau) = \exp\mathopen{}\left( -i \frac{k\mu \tau}{\hbar}\left( -\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y \right) \right)\mathclose{} . $$ This is a bit of an intimidating beast but at least in the spin-1/2 case it's not that terrible to handle, because the big correlated matrix inside the exponential squares to a simple matrix: \begin{align} \left( -\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y \right)^2 &= \hat{x}^2\hat{\sigma}_x^2 -\hat{y}\hat{x}\hat{\sigma}_y\hat{\sigma}_x -\hat{x}\hat{y}\hat{\sigma}_x\hat{\sigma}_y + \hat{y}^2\hat{\sigma}_y^2 \\ &= \hat{x}^2\hat{\sigma}_x^2 + \hat{y}^2\hat{\sigma}_y^2 \\ &= \hat{x}^2 + \hat{y}^2, \end{align} where the first equality is just the expansion of the square, the second holds because $\hat{x}$ and $\hat{y}$ commute while the Pauli matrices anticommute, $\hat{\sigma}_x\hat{\sigma}_y=-\hat{\sigma}_y\hat{\sigma}_x$, so that the cross terms cancel, and the third uses $\hat{\sigma}_x^2=\hat{\sigma}_y^2 = 1$. Both of those properties are specific to spin 1/2. (For higher spins there are still useful things you can say, but they're technical and more complicated.)
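As a sanity check on the squaring identity, here is a small sketch that treats $x$ and $y$ as c-numbers and compares the Pauli matrices against the standard spin-1 matrices (in units of $\hbar$). The function names and sample values are illustrative assumptions; the spin-1 matrices are included only to show where the spin-1/2 argument stops working.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_entry(a):
    """Largest absolute value among the entries of a matrix."""
    return max(abs(v) for row in a for v in row)

def anticommutator_size(sx, sy):
    """Largest entry of sx*sy + sy*sx; zero means the matrices anticommute."""
    xy, yx = matmul(sx, sy), matmul(sy, sx)
    n = len(sx)
    return max_entry([[xy[i][j] + yx[i][j] for j in range(n)] for i in range(n)])

def square_defect(x, y, sx, sy):
    """Largest entry of (-x*sx + y*sy)^2 - (x^2 + y^2)*identity."""
    n = len(sx)
    a = [[-x * sx[i][j] + y * sy[i][j] for j in range(n)] for i in range(n)]
    a2 = matmul(a, a)
    r2 = x * x + y * y
    return max_entry([[a2[i][j] - (r2 if i == j else 0) for j in range(n)]
                      for i in range(n)])

PAULI_X = [[0, 1], [1, 0]]
PAULI_Y = [[0, -1j], [1j, 0]]

# Standard spin-1 matrices in the S_z eigenbasis, in units of hbar.
R = 1 / math.sqrt(2)
SPIN1_X = [[0, R, 0], [R, 0, R], [0, R, 0]]
SPIN1_Y = [[0, -1j * R, 0], [1j * R, 0, -1j * R], [0, 1j * R, 0]]

x, y = 0.3, -1.1  # arbitrary sample values
print("spin 1/2:", anticommutator_size(PAULI_X, PAULI_Y),
      square_defect(x, y, PAULI_X, PAULI_Y))
print("spin 1:  ", anticommutator_size(SPIN1_X, SPIN1_Y),
      square_defect(x, y, SPIN1_X, SPIN1_Y))
```

For the Pauli matrices both numbers vanish to rounding error; for spin 1 neither does, so the exponential no longer collapses to a cosine and a sine in the same way.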

This identity is useful because it lets us split any arbitrary power of the exponentiated matrix into two simple cases: \begin{align} \left( -\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y \right)^{2n} &= \left( \hat{x}^2 + \hat{y}^2\right)^n, \quad \text{and}\\ \left( -\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y \right)^{2n+1} &= \left( \hat{x}^2 + \hat{y}^2\right)^n \left( -\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y \right), \end{align} which then lets us split the matrix exponential into two simple terms: \begin{align} \hat{U}(\tau) &= \exp\mathopen{}\left( -i \frac{k\mu \tau}{\hbar} \left( -\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y \right) \right)\mathclose{} \\ & = \sum_{n=0}^\infty \frac{1}{n!} \left(-i\frac{k\mu \tau}{\hbar}\right)^n\left( -\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y \right)^n \\ & = \sum_{n=0}^\infty \frac{(-1)^n}{(2n)!} \left(\frac{k\mu \tau}{\hbar}\right)^{2n}\left( \hat{x}^2 + \hat{y}^2\right)^n \\ & \qquad -i \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} \left(\frac{k\mu \tau}{\hbar}\right)^{2n+1} \left( \hat{x}^2 + \hat{y}^2\right)^n \left( -\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y \right) \\ & = \cos\left(\frac{k\mu \tau}{\hbar} \sqrt{\hat{x}^2 + \hat{y}^2}\right) -i\sin\left(\frac{k\mu \tau}{\hbar} \sqrt{\hat{x}^2 + \hat{y}^2}\right) \frac{-\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y }{\sqrt{\hat{x}^2 + \hat{y}^2}} . \end{align} For notational convenience, hereafter I set $\gamma = k\mu\tau/\hbar$ and $r=\sqrt{x^2+y^2}$.
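The closed form can be compared against a brute-force Taylor series of the matrix exponential. The sketch below treats $x$ and $y$ as c-numbers with $r\neq 0$, writes `gamma` for $k\mu\tau/\hbar$, and uses arbitrary sample values; the function names are made up for this illustration.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def coupling(x, y):
    """-x*sigma_x + y*sigma_y in the sigma_z basis, for c-number x, y."""
    return [[0j, -x - 1j * y], [-x + 1j * y, 0j]]

def u_series(gamma, x, y, terms=60):
    """exp(-i*gamma*A) summed term by term from its Taylor series."""
    a = coupling(x, y)
    total = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        # term_n = term_{n-1} * (-i*gamma*A) / n
        term = [[(-1j * gamma / n) * v for v in row] for row in matmul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

def u_closed(gamma, x, y):
    """cos(gamma*r) - i*sin(gamma*r)*A/r, valid for r != 0."""
    r = math.hypot(x, y)
    c, s = math.cos(gamma * r), math.sin(gamma * r)
    a = coupling(x, y)
    return [[(c if i == j else 0) - 1j * s * a[i][j] / r for j in range(2)]
            for i in range(2)]

def max_difference(gamma, x, y):
    """Largest entrywise gap between the series and the closed form."""
    s, c = u_series(gamma, x, y), u_closed(gamma, x, y)
    return max(abs(s[i][j] - c[i][j]) for i in range(2) for j in range(2))

print(max_difference(1.7, 0.3, -1.1))
```

The difference should sit at the level of floating-point rounding for moderate values of $\gamma r$; for very large $\gamma r$ the naive series would need more terms.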

Is this expression an improvement? It might not look like one, but it has broken things down into rather few important pieces, and those have relatively simple action: the operator $\hat{x}^2 + \hat{y}^2$ simply acts as ($-\hbar^2$ times) the laplacian in momentum space, so it doesn't shift the momentum in any particular direction, and there are now only two bits that act on the spin sector, and they both act directly instead of through some entangled exponential. But I digress.

So, how does this act on an actual spin-momentum state? Well, it's going to be complicated, that's for sure, and you're going to end up with some pretty complex entanglement between the momentum and the spin, which will then need to be traced out once the particle reaches the screen. That's going to make things complicated, but let's give it a shot. To do that, we describe the state of the spatial part of the system after the interaction, which is described by the density matrix $$ \hat\rho = \mathrm{Tr}_S\left[\hat{U}(\tau)|\psi(0)\rangle\langle \psi(0)| \hat{U}(\tau)^\dagger\right], $$ where we're allowed to trace out the spin since we won't be addressing it again. Here I don't see much of a substitute for diving straight in and expanding things out: assuming that the initial state $|\psi(0)\rangle = |\mathbf p\rangle|s\rangle$ had a well-defined initial momentum and it was in some initial spin state $|s\rangle$, the position representation of our reduced density matrix reads \begin{align} \langle \mathbf r|\hat\rho|\mathbf r'\rangle &= \mathrm{Tr}_S\left[\langle \mathbf r|\hat{U}(\tau)|\psi(0)\rangle\langle \psi(0)| \hat{U}(\tau)^\dagger|\mathbf r'\rangle\right] \\&= \mathrm{Tr}_S\left[ \left(\cos\left(\gamma r\right) -\frac{i}{r}\sin\left(\gamma r\right) \left(-{x}\hat{\sigma}_x + {y}\hat{\sigma}_y \right)\right) \langle \mathbf r|\mathbf p\rangle |s\rangle\langle s| \right. \\ & \qquad \qquad \qquad \left. \langle\mathbf p|\mathbf r'\rangle \left(\cos\left(\gamma r'\right) +\frac{i}{r'}\sin\left(\gamma r'\right) \left(-{x}'\hat{\sigma}_x + {y}'\hat{\sigma}_y \right)\right) \right] \\&= \mathrm{Tr}_S\left[ \left(\cos\left(\gamma r\right) -\frac{i}{r}\sin\left(\gamma r\right) \left(-{x}\hat{\sigma}_x + {y}\hat{\sigma}_y \right)\right) e^{i\mathbf p\cdot\mathbf r/\hbar} |s\rangle\langle s| \right. \\ & \qquad \qquad \qquad \left. 
e^{-i\mathbf p\cdot\mathbf r'/\hbar} \left(\cos\left(\gamma r'\right) +\frac{i}{r'}\sin\left(\gamma r'\right) \left(-{x}'\hat{\sigma}_x + {y}'\hat{\sigma}_y \right)\right) \right]
\\&= \cos\left(\gamma r\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \cos\left(\gamma r'\right) \mathrm{Tr}_S\left[ |s\rangle\langle s| \right]
\\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right)\left(-{x} \right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right)\left(-{x}' \right) \mathrm{Tr}_S\left[ \hat{\sigma}_x|s\rangle\langle s|\hat{\sigma}_x \right]
\\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right){y} e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right){y}' \mathrm{Tr}_S\left[ \hat{\sigma}_y|s\rangle\langle s|\hat{\sigma}_y \right]
\\ & \quad + \cos\left(\gamma r\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right){y}' \mathrm{Tr}_S\left[ |s\rangle\langle s|\hat{\sigma}_y \right]
\\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right)\left(-{x}\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \cos\left(\gamma r'\right) \mathrm{Tr}_S\left[ \hat{\sigma}_x|s\rangle\langle s| \right]
\\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right){y} e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right)\left(-{x}'\right) \mathrm{Tr}_S\left[ \hat{\sigma}_y|s\rangle\langle s|\hat{\sigma}_x \right]
\\ & \quad +
\cos\left(\gamma r\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right)\left(-{x}'\right) \mathrm{Tr}_S\left[ |s\rangle\langle s|\hat{\sigma}_x \right] \\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right)\left(-{x}\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right){y}' \mathrm{Tr}_S\left[ \hat{\sigma}_x|s\rangle\langle s|\hat{\sigma}_y \right] \\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right){y} e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \cos\left(\gamma r'\right) \mathrm{Tr}_S\left[ \hat{\sigma}_y|s\rangle\langle s| \right] \end{align} when you expand everything out. This is a mess for sure, but we can start to clean at least some parts up by calculating the trivial traces, giving \begin{align} \langle \mathbf r|\hat\rho|\mathbf r'\rangle &= \cos\left(\gamma r\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \cos\left(\gamma r'\right) \\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right)\left(-{x} \right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right)\left(-{x}' \right) \\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right){y} e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right){y}' \\ & \quad + \cos\left(\gamma r\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right){y}' \mathrm{Tr}_S\left[ |s\rangle\langle s|\hat{\sigma}_y \right] \\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right)\left(-{x}\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \cos\left(\gamma r'\right) \mathrm{Tr}_S\left[ |s\rangle\langle s|\hat{\sigma}_x \right] \\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right){y} e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} 
\frac{i}{r'}\sin\left(\gamma r'\right)\left(-{x}'\right) \mathrm{Tr}_S\left[ i|s\rangle\langle s|\hat{\sigma}_z \right] \\ & \quad + \cos\left(\gamma r\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right)\left(-{x}'\right) \mathrm{Tr}_S\left[ |s\rangle\langle s|\hat{\sigma}_x \right] \\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right)\left(-{x}\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right){y}' \mathrm{Tr}_S\left[ -i|s\rangle\langle s|\hat{\sigma}_z \right] \\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right){y} e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \cos\left(\gamma r'\right) \mathrm{Tr}_S\left[ |s\rangle\langle s|\hat{\sigma}_y \right] , \end{align} and then you can start putting in some definite spin states. For the sake of definiteness, let's specialize to $|s\rangle$ being an eigenstate of $\hat{\sigma}_x$, in which case $\mathrm{Tr}_S\left[|s\rangle\langle s|\hat{\sigma}_x\right]=s$, and the other two remaining traces are zero. (For a more general state, with spin $+1/2$ along some axis $\mathbf n$, the traces would return the cartesian components $\mathrm{Tr}_S\left[|s\rangle\langle s|\hat{\sigma}_j\right]=n_j$ of the axis.)
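The spin traces used above can be checked directly for a generic spin-1/2 state. This is a minimal sketch in which the state is parametrized by polar angles `theta` and `phi` of its axis $\mathbf n$; that parametrization, the helper names, and the sample angles are assumptions made for the example.

```python
import cmath
import math

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
SIGMA_X = [[0j, 1 + 0j], [1 + 0j, 0j]]
SIGMA_Y = [[0j, -1j], [1j, 0j]]
SIGMA_Z = [[1 + 0j, 0j], [0j, -1 + 0j]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spin_state(theta, phi):
    """Spin-1/2 state pointing along n = (sin t cos p, sin t sin p, cos t)."""
    return [complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2)]

def trace_with_projector(left, state, right):
    """Tr[left |s><s| right], computed as <s| right*left |s>."""
    prod = matmul(right, left)
    return sum(state[i].conjugate() * prod[i][j] * state[j]
               for i in range(2) for j in range(2))

theta, phi = 0.8, 2.1  # arbitrary sample angles
s = spin_state(theta, phi)
n = (math.sin(theta) * math.cos(phi),
     math.sin(theta) * math.sin(phi),
     math.cos(theta))

print(trace_with_projector(IDENTITY, s, SIGMA_X), n[0])
print(trace_with_projector(IDENTITY, s, SIGMA_Y), n[1])
print(trace_with_projector(SIGMA_Y, s, SIGMA_X), 1j * n[2])
print(trace_with_projector(SIGMA_X, s, SIGMA_Y), -1j * n[2])
```

Each printed pair should agree, which covers both the single-operator traces $n_j$ and the two mixed traces that reduce to $\pm i\,n_z$. For a $\hat\sigma_x$ eigenstate, $n_y=n_z=0$ and only the $n_x=s$ traces survive.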

Under those assumptions, then, we have \begin{align} \langle \mathbf r|\hat\rho|\mathbf r'\rangle &= \cos\left(\gamma r\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \cos\left(\gamma r'\right) \\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right)\left(-{x} \right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right)\left(-{x}' \right) \\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right){y} e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right){y}' \\ & \quad + \frac{-i}{r}\sin\left(\gamma r\right)\left(-s{x}\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \cos\left(\gamma r'\right) \\ & \quad + \cos\left(\gamma r\right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right)\left(-s{x}'\right) , \end{align} and we can start asking just how much we can simplify this expression. The ideal goal would be to have a pure momentum state $\langle \mathbf r|\hat\rho|\mathbf r'\rangle = e^{i \tilde{\mathbf p}\cdot\mathbf r/\hbar} e^{-i \tilde{\mathbf p}\cdot\mathbf r'/\hbar} $ with some shifted momentum $\tilde{\mathbf p}$, but we've lost a good deal of coherence to the entanglement with the spin, so we need to see what we can get.

To explore this, we can do some rearrangement on our density matrix (using $s^2=1$ to absorb the $xx'$ term into a product), and we can actually get pretty far: \begin{align} \langle \mathbf r|\hat\rho|\mathbf r'\rangle &= \left( \cos\left(\gamma r\right) + \frac{-i}{r}\sin\left(\gamma r\right)\left(-s{x} \right) \right) e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \left( \cos\left(\gamma r'\right) + \frac{i}{r'}\sin\left(\gamma r'\right)\left(-s{x}'\right) \right) \\ & \qquad + \frac{-i}{r}\sin\left(\gamma r\right){y} e^{i\mathbf p\cdot\mathbf r/\hbar} e^{-i\mathbf p\cdot\mathbf r'/\hbar} \frac{i}{r'}\sin\left(\gamma r'\right){y}' . \end{align} If we didn't have that ugly last term, and we could ignore the fact that things are in terms of $r$ instead of $x$, then we'd be golden, and we'd have a definite shift in the momentum, as in the previous case. However, that's little more than wishful thinking, and instead we need to face up to the fact that over momentum space we have nothing close to a pure state.
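The rearrangement can be spot-checked numerically by comparing the five-term sum with the factored form at random points. This sketch assumes two spatial dimensions, $\hbar=1$, and $s=\pm1$; the function names, the random seed and the sample parameters are arbitrary choices for the illustration.

```python
import cmath
import math
import random

def _parts(r_vec, rp_vec, p_vec, gamma):
    """Shared building blocks: cosines, sin/r factors and the plane-wave phase."""
    (x, y), (xp, yp) = r_vec, rp_vec
    r, rp = math.hypot(x, y), math.hypot(xp, yp)
    phase = (cmath.exp(1j * (p_vec[0] * x + p_vec[1] * y))
             * cmath.exp(-1j * (p_vec[0] * xp + p_vec[1] * yp)))
    return (math.cos(gamma * r), math.sin(gamma * r) / r,
            math.cos(gamma * rp), math.sin(gamma * rp) / rp, phase)

def five_terms(r_vec, rp_vec, p_vec, s, gamma):
    """The density-matrix element written as the sum of five terms."""
    (x, y), (xp, yp) = r_vec, rp_vec
    c, sr, cp, srp, phase = _parts(r_vec, rp_vec, p_vec, gamma)
    total = (c * cp
             + (-1j * sr * (-x)) * (1j * srp * (-xp))
             + (-1j * sr * y) * (1j * srp * yp)
             + (-1j * sr * (-s * x)) * cp
             + c * (1j * srp * (-s * xp)))
    return total * phase

def factored(r_vec, rp_vec, p_vec, s, gamma):
    """The same element as a product term plus the leftover y term."""
    (x, y), (xp, yp) = r_vec, rp_vec
    c, sr, cp, srp, phase = _parts(r_vec, rp_vec, p_vec, gamma)
    product = (c - 1j * sr * (-s * x)) * (cp + 1j * srp * (-s * xp))
    leftover = (-1j * sr * y) * (1j * srp * yp)
    return (product + leftover) * phase

random.seed(0)
worst = 0.0
for _ in range(100):
    r_vec = (random.uniform(-2, 2), random.uniform(-2, 2))
    rp_vec = (random.uniform(-2, 2), random.uniform(-2, 2))
    for s in (+1, -1):
        gap = abs(five_terms(r_vec, rp_vec, (0.4, 1.3), s, 0.9)
                  - factored(r_vec, rp_vec, (0.4, 1.3), s, 0.9))
        worst = max(worst, gap)
print(worst)
```

The agreement relies on $s^2=1$: calling the two functions with, say, `s = 0.5` gives different values, because the $xx'$ term in the five-term sum carries no factor of $s$.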

It might not look like it, but we've actually come pretty far here, and we're not too far from being able to say useful things about what happens at the screen. What we really want, in fact, is the momentum representation of the density matrix in this state, whose diagonal values are what ultimately gets detected on the screen, and the only thing we need to get there is to take the Fourier transform of our current version of the result. Thus, again, let's dive right back into it (setting $\hbar=1$ in the plane-wave exponents from here on, to lighten the notation): \begin{align} \langle \mathbf q |\hat\rho|\mathbf q'\rangle & = \int \mathrm d\mathbf r\,\mathrm d\mathbf r' \langle \mathbf q |\mathbf r\rangle \langle \mathbf r |\hat\rho|\mathbf r'\rangle \langle \mathbf r' |\mathbf q'\rangle \\ & = \int \left[ \left( \cos\left(\gamma r\right) + \frac{-i}{r}\sin\left(\gamma r\right)\left(-s{x} \right) \right) e^{i\mathbf p\cdot\mathbf r} e^{-i\mathbf p\cdot\mathbf r'} \left( \cos\left(\gamma r'\right) + \frac{i}{r'}\sin\left(\gamma r'\right)\left(-s{x}'\right) \right) \right. \\ & \qquad \left. + \frac{-i}{r}\sin\left(\gamma r\right){y} e^{i\mathbf p\cdot\mathbf r} e^{-i\mathbf p\cdot\mathbf r'} \frac{i}{r'}\sin\left(\gamma r'\right){y}' \right] e^{-i\mathbf q\cdot\mathbf r} e^{i\mathbf q'\cdot\mathbf r'} \mathrm d\mathbf r\,\mathrm d\mathbf r' \\ & = \int \left( \cos\left(\gamma r\right) + is\frac{x}{r}\sin\left(\gamma r\right) \right) e^{i(\mathbf p-\mathbf q)\cdot\mathbf r} \mathrm d\mathbf r \\ & \qquad \times \int \left( \cos\left(\gamma r'\right) - is\frac{x'}{r'}\sin\left(\gamma r'\right) \right) e^{-i(\mathbf p-\mathbf q')\cdot\mathbf r'} \mathrm d\mathbf r' \\ & \qquad + \int -i\frac{y}{r}\sin\left(\gamma r\right) e^{i(\mathbf p-\mathbf q)\cdot\mathbf r} \mathrm d\mathbf r \int i\frac{y'}{r'}\sin\left(\gamma r'\right) e^{-i(\mathbf p-\mathbf q')\cdot\mathbf r'} \mathrm d\mathbf r'. 
\end{align} This is actually pretty remarkable: our huge, complicated mess of a density matrix has essentially come down to the incoherent mixture of just two states, one of which is independent of $s$, $$ \langle \mathbf q|\psi_y\rangle = -i \int \frac{y}{r}\sin\left(\gamma r\right) e^{i(\mathbf p-\mathbf q)\cdot\mathbf r} \mathrm d\mathbf r , $$ and which can rightfully be interpreted as a source of noise brought about by the fact that our measurement hamiltonian has $\hat{y}\hat{\sigma}_y$ terms that are at right angles to our spin state $|s\rangle$, and one which does depend on $s$, $$ \langle \mathbf q|\psi_x\rangle = \int \left( \cos\left(\gamma r\right) + is\frac{x}{r}\sin\left(\gamma r\right) \right) e^{i(\mathbf p-\mathbf q)\cdot\mathbf r} \mathrm d\mathbf r , $$ and which if you squint enough sort of begins to look like a momentum state displaced by $s\hbar \gamma$ along $q_x$. To put things in simpler language, for the states defined above we have $$ \hat\rho = |\psi_x\rangle\langle \psi_x| +|\psi_y\rangle\langle \psi_y|, $$ after the interaction with the magnet, and this can be used to find any spatial observable you care to name. If what you want is the pattern observed on the screen, you can just get it directly as $|\langle \mathbf q|\psi_x\rangle|^2 + |\langle \mathbf q|\psi_y\rangle|^2$, i.e. the incoherent mixture of the results from the two pure states above.
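One easy consistency check on this two-state decomposition lives in position space rather than momentum space. The magnet's unitary is diagonal in position, so the position density of the initial plane wave must stay uniform, which means the two integrands should satisfy $|\psi_x(\mathbf r)|^2+|\psi_y(\mathbf r)|^2=1$ at every point. The sketch below checks this with $\hbar=1$ and $s=\pm1$; the position-space amplitudes are read off from the integrands above, and the function names and sample values are illustrative assumptions.

```python
import cmath
import math

def psi_x(r_vec, p_vec, s, gamma):
    """Position-space amplitude of the s-dependent component."""
    x, y = r_vec
    r = math.hypot(x, y)
    wave = cmath.exp(1j * (p_vec[0] * x + p_vec[1] * y))
    return (math.cos(gamma * r) + 1j * s * (x / r) * math.sin(gamma * r)) * wave

def psi_y(r_vec, p_vec, gamma):
    """Position-space amplitude of the s-independent 'noise' component."""
    x, y = r_vec
    r = math.hypot(x, y)
    wave = cmath.exp(1j * (p_vec[0] * x + p_vec[1] * y))
    return -1j * (y / r) * math.sin(gamma * r) * wave

def position_density(r_vec, p_vec, s, gamma):
    """Diagonal element <r|rho|r> of the reduced density matrix."""
    return abs(psi_x(r_vec, p_vec, s, gamma)) ** 2 + abs(psi_y(r_vec, p_vec, gamma)) ** 2

points = [(0.3, -1.1), (2.0, 0.5), (-0.7, -0.7), (1e-3, 4.0)]
for point in points:
    for s in (+1, -1):
        print(point, s, position_density(point, (0.4, 1.3), s, 0.9))
```

Every printed density should equal 1 up to rounding. All of the interesting structure is therefore in the phases, and it only shows up as a spatial pattern once the momentum-space densities have had time to separate on the way to the screen.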

Now, those Fourier transforms do look like a pair of bruisers, but they're perfectly doable by going to polar coordinates: the angular integral will return a Bessel function of the form $J_\nu(|\mathbf p-\mathbf q|r)$, and the radial integral can be done in terms of hypergeometric $_2F_1$ functions at arguments that evaluate to rational functions with poles at $(\mathbf p-\mathbf q)^2 = \gamma^2$. (The convergence, on the other hand, will be pretty tricky.) Alternatively, those integrals are probably susceptible to a manageable integration by residues.

More physically, what that means is that the entanglement induced by the interaction with the magnet has spread what used to be a perfectly collimated plane wave $|\mathbf p\rangle|s\rangle$ into a blob that decays algebraically away from its center, but with a rather high likelihood this blob will be displaced along $q_x$ by the expected amount of $s\hbar \gamma$.

(If I have time I'll come back and chase those tails, but frankly it's all footwork now.)


OK, so that was a lot of work and a lot of words, but there are still some loose ends to tie up. In particular, you observe correctly that the vanishing divergence of the magnetic field forces us to have a nonzero gradient in the $y$ direction which we would very much rather ignore, as it would force us into the ugly shenanigans I've been labouring over for the past several pages. Nevertheless, most of the time it seems that you can just forget about that kind of thing and your Stern-Gerlach analysis will work just fine (or so we like to pretend), so - is that true? and if so, why? and more importantly, can that be justified in a solid quantum mechanical manner without handwaving references to "precession" or whatnot?

The answer to that is that, as you note, this kind of thing only really becomes relevant if we turn off the field at the position of the beam. In the more common situation that we have a rather strong magnetic field at the beam, we need to include that into the interaction hamiltonian, and the result of this is that the unitary effected by the magnet will have the form $$ \hat{U}(\tau) = \exp\mathopen{}\left[ -i\left( b\hat{\sigma}_x + \gamma\left( -\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y \right) \right)\right]\mathclose{} , $$ with a term in $b \hat{\sigma}_x$ that will drive much of the dynamics (here $b=\mu B_x(\mathbf 0)\tau/\hbar$ is dimensionless, by the same conventions as $\gamma$). This term scuppers all of our work above, because now the square of the exponent does not simplify quite as well as it did above, so all of those manipulations need to be repeated with that in mind: you're still likely to be able to do e.g. $$ \left( b\hat{\sigma}_x + \gamma\left( -\hat{x}\hat{\sigma}_x + \hat{y}\hat{\sigma}_y \right) \right)^2 = (b-\gamma \hat{x})^2+ \gamma^2\hat{y}^2 , $$ and to put that into a simplified expression for $\hat{U}(\tau)$, but at the end of the day in the limit of large $b/\gamma$ (which is a tricky limit as that quantity is a length) you're likely to get a good approximation to the limit by just ignoring the dynamics along $y$.
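The squared exponent with the extra uniform-field term can be checked the same way as before. This is a minimal sketch with $x$ and $y$ as c-numbers; `b`, `gamma` and the sample values are arbitrary, and the function names are invented for the example.

```python
SIGMA_X = [[0j, 1 + 0j], [1 + 0j, 0j]]
SIGMA_Y = [[0j, -1j], [1j, 0j]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exponent(b, gamma, x, y):
    """b*sigma_x + gamma*(-x*sigma_x + y*sigma_y) for c-number x, y."""
    return [[(b - gamma * x) * SIGMA_X[i][j] + gamma * y * SIGMA_Y[i][j]
             for j in range(2)] for i in range(2)]

def square_defect(b, gamma, x, y):
    """Largest entry of exponent^2 - ((b - gamma*x)^2 + gamma^2*y^2)*identity."""
    e = exponent(b, gamma, x, y)
    e2 = matmul(e, e)
    target = (b - gamma * x) ** 2 + (gamma * y) ** 2
    return max(abs(e2[i][j] - (target if i == j else 0))
               for i in range(2) for j in range(2))

print(square_defect(5.0, 0.9, 0.3, -1.1))
```

The square is again proportional to the identity, so the cosine-plus-sine form of the exponential still applies, now with $\sqrt{(b-\gamma x)^2+\gamma^2y^2}$ in place of $\gamma r$. For $b\gg\gamma|\mathbf r|$ that square root is approximately $b-\gamma x$, which is one way to see why the $y$ dependence drops out to leading order.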

That's probably enough of a tome for now, though.


Let's define the operator $G \equiv -p_x\sigma_x + p_y\sigma_y$. This $G$ is a Hermitian operator, so it constitutes a perfectly valid quantum measurement. I think this is essentially equivalent to the measurement you're describing. But when I write it this way it's less mysterious!

For example, here is $G$ in the z-spin-basis: $$G=\left(\begin{matrix}0 & -p_x-ip_y \\ -p_x + ip_y & 0 \end{matrix}\right)$$ What does $G$ mean? What does it do? I don't have any concise answer, I think you just have to go through some examples. If the particle starts out in such-and-such spatial and spin state, and I measure $G$, what are the possible measurement results and final states? I think you can get all kinds of interesting entanglements between x-components and y-components and spin.
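As one concrete example of the kind suggested here, the sketch below builds $G$ from its definition for a momentum eigenstate, so that $p_x$ and $p_y$ are plain numbers, and computes the possible measurement results. The function names and the sample momentum are assumptions made for the illustration.

```python
import cmath
import math

SIGMA_X = [[0j, 1 + 0j], [1 + 0j, 0j]]
SIGMA_Y = [[0j, -1j], [1j, 0j]]

def g_matrix(px, py):
    """G = -px*sigma_x + py*sigma_y in the z-spin basis, for c-number px, py."""
    return [[-px * SIGMA_X[i][j] + py * SIGMA_Y[i][j] for j in range(2)]
            for i in range(2)]

def eigenvalues(m):
    """Eigenvalues of a Hermitian 2x2 matrix from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + disc) / 2).real, ((tr - disc) / 2).real

px, py = 0.6, -0.8  # arbitrary sample momentum
print(eigenvalues(g_matrix(px, py)), math.hypot(px, py))
```

On a state of definite momentum the two outcomes are $\pm\sqrt{p_x^2+p_y^2}$, and the corresponding spin eigenstates point along an in-plane axis that depends on the direction of $\mathbf p$. That dependence is where the entanglement between the spatial components and the spin comes from.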

Anyway, there's no paradox, and I can't follow exactly why you think there is, so I suggest just working through some examples of $G$ acting on different states yourself...