Difference between spin and polarization of a photon
The short answer is that the spin states of a photon come in two kinds, distinguished by helicity: how the circular polarization tracks with the direction of the photon's momentum. You can think of them as circularly polarized in the sense that we can define the relationship between the different polarizations the same way we do for classical electromagnetic waves. A single photon is not a classical electromagnetic wave, but we'll use the same math and the same terminology.
So I'll talk about polarization of classical electromagnetic waves just because you've already seen it. Imagine a wave travelling in the $z$ direction with the electric field always pointing along the same axis, say $\pm x$. This is called a linearly polarized wave. The same holds if the wave travels in the $z$ direction and the electric field points in the $\pm y$ direction. If those two waves are in phase and have the same magnitude, then their superposition is a wave that oscillates at the same frequency/wavelength as the original waves and is still linearly polarized, this time not in the $x$ or $y$ direction but in the direction $45$ degrees (halfway) between them. Basically, if the electric field always points in plus or minus the same direction, that's linear polarization, and it can in principle be in any direction: adjust the relative magnitude of an $x$ polarized wave and a $y$ polarized wave that are in phase with each other.
OK, what if they aren't in phase? Suppose they are a quarter of a period out of phase. Then when the $x$ component is at its maximum the $y$ component is zero, so the field points entirely in the $x$ direction; a quarter period later it points entirely in the $y$ direction, and so the direction of the field rotates. If the out-of-phase fields in the $x$ and $y$ directions have the same magnitude, the head of the electric field vector moves in a circle; otherwise it moves in an ellipse. If instead you put them three quarters of a period out of phase, the head goes around the circle in the opposite direction. Waves in which the head of the electric field moves in a circle are called circularly polarized waves.
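If it helps to see the phase argument numerically, here is a minimal sketch in plain Python. The helper names `field` and `rotation_sense` are made up for this illustration, and time and phase are measured in units of one period. It only checks which way the tip of the field vector turns for a given phase offset between the $x$ and $y$ components; it does not attach the labels "left" or "right", since those depend on convention.

```python
import math

def field(t, phase_y, ax=1.0, ay=1.0):
    """Return (Ex, Ey) at a fixed point in space.

    t and phase_y are in units of one period; ax and ay are the amplitudes.
    """
    ex = ax * math.cos(2 * math.pi * t)
    ey = ay * math.cos(2 * math.pi * (t - phase_y))
    return ex, ey

def rotation_sense(phase_y, steps=8):
    """Sum the z-component of E(t) x E(t + dt) over one period.

    Returns +1 if the tip turns counterclockwise (x to the right, y up),
    -1 if it turns clockwise, and 0 if it does not turn at all (linear).
    """
    total = 0.0
    for k in range(steps):
        ex0, ey0 = field(k / steps, phase_y)
        ex1, ey1 = field((k + 1) / steps, phase_y)
        total += ex0 * ey1 - ey0 * ex1
    if abs(total) < 1e-9:
        return 0
    return 1 if total > 0 else -1

for phase in (0.0, 0.25, 0.75):
    print(phase, rotation_sense(phase))
```

With equal amplitudes, a zero offset gives no rotation (the $45$ degree linear case), while a quarter-period and a three-quarter-period offset give opposite senses of rotation.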
OK, that's it for classical waves. You could discuss how photons make up classical waves, but that's not really what the question is about. The question is about spin for photons. Spin states for the photon come in two kinds: the positive spin state $|+\hbar\rangle$ is called plus, $|+\rangle$, and the negative spin state $|-\hbar\rangle$ is called minus, $|-\rangle$. You can treat them just like the circularly polarized states.
Now we're going to steal some math and some terminology. Think of multiplying by $i$ as changing the phase of the wave by a quarter period. Then we built one circular polarization as $X+iY$ and the other as $X+i^3 Y=X-iY$ (three quarter-period shifts). Given the two circular polarizations, we can add them to get linearly polarized states: $|+\rangle + |-\rangle$ gives one linearly polarized state, and $-i(|+\rangle - |-\rangle)$ gives a linearly polarized state orthogonal to it. We can borrow all the math and terminology from the classical waves, and associate the spin states of the photon with the right and left circularly polarized waves.
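As a quick check of that algebra, here is a small sketch that writes each state as a pair of complex numbers (its $x$ component and its $y$ component). The tuple representation and the helpers `add`, `scale` and `sub` are just assumptions for this illustration, and nothing is normalized, to match the expressions above.

```python
# States as (x-component, y-component) pairs of complex numbers.
X = (1, 0)
Y = (0, 1)

def add(u, v):
    """Componentwise sum of two states."""
    return (u[0] + v[0], u[1] + v[1])

def scale(c, u):
    """Multiply a state by a complex number c."""
    return (c * u[0], c * u[1])

def sub(u, v):
    """Componentwise difference u - v."""
    return add(u, scale(-1, v))

# One quarter-period shift of Y is a factor i; three shifts is i*i*i = -i.
plus = add(X, scale(1j, Y))             # X + iY
minus = add(X, scale(1j * 1j * 1j, Y))  # X - iY

lin_a = add(plus, minus)                # |+> + |->      -> 2X
lin_b = scale(-1j, sub(plus, minus))    # -i(|+> - |->)  -> 2Y

print(lin_a, lin_b)
```

The sum comes out purely along $x$ and the $-i$ times the difference purely along $y$, which is why those two combinations behave like the two orthogonal linear polarizations.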
We are stealing the math and stealing the terminology, but the fact is that we have two vectors $|+\rangle$ and $|-\rangle$ that span a (complex) two-dimensional space of possibilities, and the basis $$\left\{(|+\rangle + |-\rangle), -i(|+\rangle - |-\rangle) \right\}$$ would work just as well. We could also use $$\left\{((|+\rangle + |-\rangle) - i(|+\rangle - |-\rangle)),((|+\rangle + |-\rangle) +i(|+\rangle - |-\rangle))\right\}$$ which are two more linearly polarized states. Mathematically the spin states are like the left and right circularly polarized waves, so their sum and difference are like the $x$ and $y$ polarized waves (with one of them shifted by a phase), and the $45$ degree tilted ones really are literal sums and differences of the $x$ and $y$ (in phase) waves.
So $\{ |+\rangle , |-\rangle \}$ is one basis,
$\left\{(|+\rangle + |-\rangle), -i(|+\rangle - |-\rangle) \right\}$ is another basis and
$\left\{((|+\rangle + |-\rangle) - i(|+\rangle - |-\rangle)),((|+\rangle + |-\rangle) +i(|+\rangle - |-\rangle))\right\}$ is a third basis.
Each basis has the property that either of its states is an equal-parts combination of the two states in each of the other two bases. That's what the key distribution is based on: just having multiple bases for a two dimensional set of states. All I've done above is write everything in terms of the spin states. Mathematically any basis is fine, and all three of these are equally nice in that within a basis the two states are orthogonal to each other, and if you pick one state from one basis it has equal sized dot products with either state from the other bases.
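The "equal parts" claim can be checked directly. The sketch below writes each state by its components along $(|+\rangle, |-\rangle)$, normalizes it (the bases above are written without normalization), and computes $|\langle u|v\rangle|^2$ for every pair. The names `normalize`, `overlap`, `circular`, `linear` and `diagonal` are labels chosen for this illustration.

```python
import math

def normalize(v):
    """Rescale a state so its components' squared magnitudes sum to 1."""
    n = math.sqrt(sum(abs(c) ** 2 for c in v))
    return tuple(c / n for c in v)

def overlap(u, v):
    """Return |<u|v>|^2 for two normalized states."""
    return abs(sum(a.conjugate() * b for a, b in zip(u, v))) ** 2

# Components along (|+>, |->) for the three bases in the text.
circular = [normalize((1, 0)), normalize((0, 1))]
linear = [normalize((1, 1)), normalize((-1j, 1j))]
diagonal = [normalize((1 - 1j, 1 + 1j)), normalize((1 + 1j, 1 - 1j))]

bases = {"circular": circular, "linear": linear, "diagonal": diagonal}

for name_a, basis_a in bases.items():
    for name_b, basis_b in bases.items():
        values = [round(overlap(u, v), 6) for u in basis_a for v in basis_b]
        print(name_a, name_b, values)
```

Within one basis the two different states have zero overlap, and any state from one basis has overlap $1/2$ with either state from a different basis, which is the property the key distribution relies on.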
Worrying about how these relate to classical waves is a distraction since it is the borrowing of the math and the terminology that is going on.
The classical light beam emerges from a synergy of photons.
Photons, as quantum mechanical entities, are described by the solution of their quantum mechanical equation, a wave function. This equation, if you can follow the link, is a quantized version of Maxwell's equations in their potential form, acting on the photon wave function.
The state function of each photon is a complex number: it has an amplitude whose square gives the probability of finding the photon at (x,y,z) at time t, and a phase. In an ensemble of photons, the phases build up the electric and magnetic fields that are seen macroscopically.
Polarisation of the classical light means that the electric and magnetic fields are built up in a specific way, linear or circular. An innumerable number of photons contribute to the build-up. Each individual photon has its spin either along the direction of motion or against it, but the synergistically built-up electric field which defines macroscopic polarization is not a simple addition. This wiki link gives the mathematics of how this happens, and it needs second quantization.
Left and right handed circular polarization, and their associated angular momenta.
Please note that the individual photons have spin either along or against their direction of motion, while the electric fields are perpendicular to it. These fields are built up non-trivially: it is the handedness of the electric field vector (which defines polarization classically) as it progresses in space and time that connects the electric fields to the spin direction.
What we call spin has really little to do with quantum mechanics and more to do with group theory and representations of the Lorentz group. Even before quantizing, the Dirac field, and the EM field, transform in a certain way under Lorentz transformations, and their transformation properties are captured by their spin. The reason these things are quantized is because of compactness of rotations in 3D, the same reason sound waves in a tube are quantized, and again has nothing to do with quantum mechanics, Hilbert spaces, etc.
It is important to realize that what folks usually think of as undergraduate single-particle quantum mechanics is actually classical field theory with a bit of Hilbert space stuff bolted on. It's only through quantum field theory that quantization is taken all the way. The single particle S.E. with spin is actually an approximation to the non-quantum relativistic Dirac equation, and the spin comes from this field being a spinor field. It's only once you quantize this field that you can claim that you are doing quantum mechanics. But to reduce the mental burden in undergraduate physics, we restrict ourselves to the 1-particle state of this quantum field, and these 1-particle states obey the classical Dirac equation (or, at low energy, the S.E.). When you talk about Stern-Gerlach experiments, you should isolate the quantum (measurement, probabilities, projection) from the not-strictly-quantum (spin), just as you can for spin-less particles. There we can measure position, but we don't claim that position is an inherently quantum idea with no classical analog. (I should stress that when physicists say classical, they often mean not quantum, not necessarily pre-1900s).
Now, it's a historical accident that we discovered the "classical" Dirac field a bit after/or at the same time as quantum mechanics, so people tend to confuse what is quantum and what is not. However, the same thing happens with E&M. There, the classical field needs to be quantized, and we end up with multi-photon states. But historically we discovered the E&M field first, long before quantum mechanics. The EM field, being a vector, transforms as spin 1, but because we can't go into the photon rest frame, and because of gauge invariance, only 2 possible components of spin can be measured. It's instructive to look up Wigner's classification and little groups.