Informational capacity of qubits and photons
For (1), there is a theorem of Holevo that implies you cannot extract more than one bit of information from one qubit. You can indeed encode one bit of information, since the two inputs $| 0 \rangle$ and $| 1 \rangle$ (or any two orthogonal states) are distinguishable. If the sender and receiver share an entangled state, they can use superdense coding to send two bits using one qubit, but this is the maximum.
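To make the superdense-coding remark concrete, here is a minimal numpy sketch (the function names and structure are mine, purely illustrative). Alice's local Pauli operation maps the shared Bell pair onto one of four mutually orthogonal Bell states, so Bob's single joint measurement recovers two classical bits:

```python
import numpy as np

# Pauli matrices and identity (acting on Alice's half of the pair)
I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])

# Shared Bell state |Phi+> = (|00> + |11>)/sqrt(2)
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)

# Alice encodes two classical bits by applying I, X, Z, or XZ to her qubit.
encodings = {(0, 0): I, (0, 1): X, (1, 0): Z, (1, 1): X @ Z}

# The four resulting Bell states are orthonormal, so Bob can distinguish
# them perfectly with a single joint measurement.
bell_basis = {bits: np.kron(gate, I) @ bell for bits, gate in encodings.items()}

def superdense_send(bits):
    """Alice applies her local gate and sends her qubit to Bob."""
    return np.kron(encodings[bits], I) @ bell

def bob_decode(state):
    """Bob measures in the Bell basis and reads off two classical bits."""
    for bits, basis_state in bell_basis.items():
        if abs(np.vdot(basis_state, state)) > 0.99:  # overlap is 1 or 0 (up to sign)
            return bits

for bits in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    assert bob_decode(superdense_send(bits)) == bits
```

The point is that the four encoded states are orthogonal, so one qubit in flight (plus one previously shared qubit of entanglement) carries exactly two bits - no more.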
For (2), if the wavelength is exactly 500 nm (or at least as close as you can get in one hour), then by the uncertainty principle the photon must consist of an hour-long wave train - a sine wave whose frequency corresponds to 500 nm - which (aside from its polarization) carries no information. You need non-zero bandwidth to transmit information. If you have non-zero bandwidth, you can encode information in the frequency of the photon, and the amount of information you can transmit depends on that bandwidth. The timescale of one hour limits the precision with which you can determine the frequency, and this gives you the amount of information you can send. Note that if you use frequency to transmit the information, Alice's random delay doesn't hurt you. If you didn't have the random delay, you could instead use the non-zero bandwidth to create wave packets that are localized in time, and use timing to transmit the same amount of information.
Let's do some calculations. Let's suppose the wavelength is 500 nm $\pm$ 0.5 nm; then the frequency is $6\times10^{14} \pm 6\times10^{11}$ Hz, and the bandwidth is $1.2\times10^{12}$ Hz. Now, we have $\Delta T\,\Delta E \geq \hbar$, and $E = h \nu$, so $\Delta T\,\Delta \nu \geq \frac{1}{2\pi}$. If $T$ is 3600 s, we can distinguish frequencies to an accuracy of $4.4\times10^{-5}$ Hz, so we get $2.7\times10^{16}$ different frequencies we can distinguish. Taking the log (base 2), this is around 55 bits. I suspect this is the most information that can be sent with one photon of the given bandwidth, but I don't know a proof of this.
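For readers who want to check the arithmetic, here is a short Python computation reproducing the numbers above (constants rounded the same way as in the text):

```python
import math

c = 3.0e8                     # speed of light, m/s
lam, dlam = 500e-9, 0.5e-9    # wavelength 500 nm +/- 0.5 nm
T = 3600.0                    # one hour, in seconds

nu = c / lam                     # central frequency, ~6e14 Hz
bandwidth = 2 * nu * dlam / lam  # full width, ~1.2e12 Hz
dnu = 1 / (2 * math.pi * T)      # resolution from dT*dnu >= 1/(2*pi), ~4.4e-5 Hz

N = bandwidth / dnu              # ~2.7e16 distinguishable frequencies
bits = math.log2(N)              # ~55 bits
print(f"nu = {nu:.2e} Hz, bandwidth = {bandwidth:.2e} Hz, dnu = {dnu:.2e} Hz")
print(f"N = {N:.2e}, bits = {bits:.1f}")
```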
This is all theoretical; doing this in practice would require a ridiculously accurate frequency filter. If you didn't have the random delay, using wave packets localized in time and a very accurate clock would work better in practice, although they won't let you exceed 55 bits, either. But even with the random delay, I wouldn't be surprised if there were clever, experimentally more feasible, ways of communicating with one photon.
A qubit is a unit of information, so the information contained in one qubit is exactly one qubit. For most purposes, this information may be identified with the information of one bit. We call it a "quantum bit" because the two possibilities may be combined into an arbitrary complex superposition such as $a|0\rangle + b|1\rangle$.
However, the complex amplitudes $a,b$ cannot be measured - at least not by a single measurement. You can measure whether the system described by the qubit is found in the state $|0\rangle$ or $|1\rangle$ (and you may do analogous measurements with respect to any orthogonal basis of the 2-dimensional Hilbert space). If you do so, you obtain either 0 or 1, nothing else. You obtain 0 with probability $|a|^2$ and 1 with probability $|b|^2$.
Even if you choose another basis and make the measurement, you get at most one bit of information from the measurement. More precisely, you usually get less than one bit - namely $-p_a\ln(p_a)-p_b\ln(p_b)$, where $p_a=|a|^2$ and $p_b=|b|^2$. The expression is maximized for $p_a=p_b=1/2$, where its value is $\ln(2)$ of information (in "nats"), which is called one bit. For asymmetric choices of $p_a,p_b$, we get less than one bit of information from the measurement.
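Here is a small Python sketch of this entropy count (the function name is mine), using $\log_2$ so the answer comes out in bits directly:

```python
import numpy as np

def measurement_entropy_bits(a, b):
    """Shannon entropy (in bits) of measuring a|0> + b|1> in the computational basis."""
    p = np.array([abs(a)**2, abs(b)**2])
    p = p[p > 0]                       # 0*log(0) = 0 by convention
    return float(-(p * np.log2(p)).sum())

print(measurement_entropy_bits(1/np.sqrt(2), 1/np.sqrt(2)))  # 1.0 bit (the maximum)
print(measurement_entropy_bits(np.sqrt(0.9), np.sqrt(0.1)))  # ~0.47 bits
```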
If you repeat the same experiment many times, say $K$ times, you may measure the amplitudes with a relative error of order $1/\sqrt{K}$. However, it's the very point of quantum computation - the discipline where the notion of a "qubit" actually becomes useful - that you only want to run the quantum algorithm once and get the result. If you needed to run the quantum algorithm many times in order to measure the amplitudes, all the magical exponential speedup of quantum computers would be gone.
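A minimal simulation of this $1/\sqrt{K}$ scaling, assuming we simply repeat a computational-basis measurement on $K$ identically prepared qubits and estimate $p_a=|a|^2$ from the frequency of outcome 0:

```python
import numpy as np

rng = np.random.default_rng(0)
p_true = 0.3   # |a|^2 for the state sqrt(0.3)|0> + sqrt(0.7)|1>

for K in [100, 10_000, 1_000_000]:
    outcomes = rng.random(K) < p_true   # K independent measurements, True = outcome 0
    p_est = outcomes.mean()
    rel_err = abs(p_est - p_true) / p_true
    print(f"K={K:>9}: estimate={p_est:.4f}, relative error ~ {rel_err:.4f} "
          f"(expected order {1/np.sqrt(K):.4f})")
```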
Moreover, even if you repeatedly try to measure the amplitudes, you won't find the equivalent of four real numbers. First of all, the wave function of the qubit has to be normalized so that the total probability of all possibilities equals 100 percent. It means that $$|a|^2+|b|^2=1.$$ Moreover, a change of the overall phase, i.e. the transformation $$(a,b)\to (a\,e^{i\phi},b\,e^{i\phi}),$$ is unphysical because the overall phase is unmeasurable by any tools, even in principle. So in fact, the amount of measurable information - if you allow repeated measurements of the system in the same initial state - is not four real numbers but just two. Without loss of generality, you may write $$(a,b) = (A,e^{iB} \sqrt{1-A^2})$$ where $A,B$ are two real non-negative parameters, with $A$ between $0$ and $1$ and $B$ between $0$ and $2\pi$.
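To make the two-parameter counting concrete, here is a small numpy check (the function name is mine) that the form $(A, e^{iB}\sqrt{1-A^2})$ is automatically normalized and that an overall phase drops out of every measurable probability:

```python
import numpy as np

def qubit_state(A, B):
    """State (a, b) = (A, exp(iB) * sqrt(1 - A^2)) with 0 <= A <= 1, 0 <= B < 2*pi."""
    return np.array([A, np.exp(1j * B) * np.sqrt(1 - A**2)])

psi = qubit_state(0.6, 1.2)
print(np.vdot(psi, psi).real)            # 1.0: normalized by construction

# Measurement probabilities are insensitive to an overall phase:
psi2 = np.exp(1j * 0.7) * psi
print(np.abs(psi)**2, np.abs(psi2)**2)   # identical probability vectors
```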
So, as I said, it's fallacious to imagine that the amplitudes $a,b$ are classical numbers. They're not classical numbers in any sense - it's always the same debate about the wave function that most laymen want to imagine as a "real classical wave". It's not a real classical wave. It's just a tool to predict probabilities. And one qubit is not the same thing as a "large number of classical bits". Instead, it is one bit that has certain qualitatively new properties. As long as you try to imagine that quantum mechanics is the same thing as classical physics, just with some bigger objects, you will misunderstand the essence of quantum physics. Quantum physics is qualitatively different from anything we know from the (seemingly) classical world.
Alice, Bob, and the photon
If Alice and Bob send signals to one another and the only information they can adjust is the timing of one photon, then the information carried by this timing is simply $\ln(N)/\ln(2)$ bits, where $N$ is the number of "moments" or "intervals" that they can distinguish. If you constrain what they're allowed to measure in this way, then it's meaningless to talk about quantum bits.
What they send to each other by the timing of the photon is ordinary classical information. You would have to allow Alice and Bob to measure the interference of the various possibilities - various timings - to turn them into a quantum computer, and only then could it make sense to talk about "qubits". However, as you designed it, the information is classical and the unit should be called one bit.
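As a rough consistency check, one can reuse the numbers from the first answer: with a one-hour window and a $1.2\times10^{12}$ Hz bandwidth, the shortest wave packet lasts roughly one over the bandwidth, and counting time bins gives about the same number of bits as the frequency-based estimate (the constants below are assumptions carried over from that answer):

```python
import math

T = 3600.0            # one-hour window, s
bandwidth = 1.2e12    # Hz, from the earlier estimate
dt = 1 / bandwidth    # shortest wave packet the bandwidth allows, ~0.8 ps

N = T / dt            # number of distinguishable time bins
print(math.log(N) / math.log(2))   # ~52 bits, same ballpark as the frequency count
```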