Stimulated emission: how can giving energy to electrons make them decay to a lower state?

Do you want a technical quantum mechanics explanation, or a hand-wavy non-technical explanation?

The hand-wavy non-technical explanation: photons are bosons, and bosons like being together (unlike fermions, who are loners).

So suppose you're sitting in your apartment and wondering whether you want to go see the new Star Wars movie. Possibly you'll end up gathering enough energy to go to the theater and buy a ticket, but probably not. Then five of your friends knock on your door and say "let's go see the new Star Wars movie". You throw on your jacket and go with them.

That's more or less how stimulated emission works, but of course to be precise, you need to write down the quantum mechanical equations that tell you just how much photons like being with their friends, and see that it really works. And for photons, if thousands of their friends come by, they're much more likely to go than if just five of them do (while in real life, if thousands of your friends came by, the smart thing to do would be to decide that the theater was going to sell out and stay home).

There's an easy way to show that what you might naively expect (that there is no difference between the rates of spontaneous emission and stimulated emission) doesn't work. Recall that quantum mechanics is reversible. What this means is that absorption and emission should work the same way. Now, suppose you have $n$ photons that illuminate an atom in the ground state. You would naively expect that the rate of absorption would be $n$ times the rate with one photon. And in fact, that's correct.

Now, suppose you have an atom in the excited state, and you shine $n-1$ photons on it. The process of its decaying to the ground state, with $n$ photons leaving it, is exactly the reverse of the atom being excited when it's illuminated by $n$ photons, with $n-1$ photons leaving it. So the rate of emission in the presence of $n-1$ photons should be $n$ times the rate of spontaneous emission into that mode. (This is complicated a little bit because there's only one mode an atom can decay into in stimulated emission, while there may be more than one for spontaneous emission.)

This explanation probably still isn't entirely satisfactory, because it doesn't justify the fact that $n$ photons will excite an atom in the ground state at $n$ times the rate that one photon will. The essential reason this happens is that the creation operator $a^\dagger$ satisfies $a^{\dagger } | n\rangle ={\sqrt {n+1}} \,| n+1\rangle$. Julian Ingham's answer explains this in more detail.
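For anyone who wants to check the $\sqrt{n+1}$ factor numerically, here is a minimal sketch using NumPy. It builds the ladder operators on a truncated Fock space (the cutoff `dim` and the helper names `ladder_operators` and `fock` are my own choices for illustration, not anything from the answers here) and prints the squared matrix elements for emission and absorption.

```python
import numpy as np

def ladder_operators(dim):
    """Return (a, a_dagger) as dim x dim matrices on the basis |0>, ..., |dim-1>."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)  # a|n> = sqrt(n) |n-1>
    return a, a.T.copy()

def fock(n, dim):
    """Basis vector |n> in the truncated Fock space."""
    v = np.zeros(dim)
    v[n] = 1.0
    return v

dim = 12  # hypothetical cutoff; it only needs to exceed the largest n used below
a, a_dag = ladder_operators(dim)

for n in range(1, 6):
    emission = fock(n + 1, dim) @ a_dag @ fock(n, dim)  # <n+1| a^dagger |n>
    absorption = fock(n - 1, dim) @ a @ fock(n, dim)    # <n-1| a |n>
    # Squared matrix elements: n + 1 for emission, n for absorption.
    print(n, emission**2, absorption**2)
```

The squared emission element should come out as $n+1$ and the squared absorption element as $n$, up to floating-point rounding, which is the asymmetry that the argument above relies on.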


Edit: I've edited this answer to add more intuitive explanations, see the end. The electrons don't receive energy from the photons; it's just that the initial presence of $N$ photons increases the probability that the electron emits another photon. "Dipoles" and "population inversion" are actually irrelevant.

Peter Shor's answer is a nice intuitive sketch, but here's the mathematical presentation he/OP requested.

Quick run-through of quantum electrodynamics, then it will be clear: recall that the interaction between charged fields and the photon is given by \begin{equation} \mathscr{V}_{int}=e\int (\hat{j}\hat{A}) d^3x \end{equation} We can decompose the free electromagnetic field into a sum of photon creation and annihilation operators \begin{equation} \hat{A}=\sum_{n}\left(\hat{c}_nA_n(x)+\hat{c}^\dagger_nA^*_n(x)\right) \end{equation} As we know from the harmonic oscillator, each operator has matrix elements only between states whose occupation numbers $N_n$ (the number of photons of type $n$; by type we mean of a given frequency/wavevector, since we count the number of photons of different frequencies separately) differ by one. That is, only processes of the emission or absorption of a single photon occur in the first approximation of perturbation theory. (Though again, in analogy with the harmonic oscillator, we know that at the $m$th order in perturbation theory, $m$-photon processes are possible, i.e. matrix elements connecting $N_n$ and $N_n\pm m$.) Quantitatively, the matrix elements of the operators $c_n$ are given by \begin{equation} \langle N_n|c^\dagger_n| N_n-1\rangle=\langle N_n-1|c_n|N_n\rangle=\sqrt{N_n} \end{equation} (The convention is that $c_n$ are the usual "$a_n$", but with a factor of $\sqrt{2\pi/\omega}$ absorbed into them).

Investigating the probability of an absorption/emission process requires perturbation theory. Let us assume for simplicity that the initial and final states of the emitting/absorbing system belong to the discrete spectrum. Then the probability rate is given by the Fermi golden rule \begin{equation} dw=2\pi |\mathscr{V}_{fi} |^2 \delta\left(E_i-E_f-\omega\right) d\nu \end{equation} We have adopted the normalisation of the photon wavefunction so that there is one photon per volume V, and the photon wavefunction is normalised by integrating over $d\nu$. The bottom line here is that the probability rate is proportional to the square of the matrix element of $\mathscr{V}$ between the initial and final state.

Okay so here's the punchline: if the initial state of the field already has a nonzero number $N_n$ of the photons in question, the matrix element for the transition is multiplied by \begin{align} \langle N_n+1|c^\dagger_n|N_n\rangle=\sqrt{N_n+1} \end{align} i.e. the transition probability, which involves the square of the matrix element, gets multiplied by $N_n+1$. The 1 in this factor corresponds to the $\textbf{spontaneous emission}$ which occurs even if $N_n=0$. The term $N_n$ represents the $\textbf{stimulated or induced emission}$: the presence of photons in the initial state of the field stimulates the further emission of photons of the same kind. The hand-waving explanation is exactly that photons are bosons, see Peter Shor's answer. This is also the same "$N+1$" phenomenon cited in a newer answer, which involves the example of a molecular toy Hamiltonian.

Incidentally, we can obtain the Einstein relations from here with minimal effort: the matrix element for the opposite change of state will be proportional to \begin{align} \langle N_n-1|c_n| N_n\rangle=\sqrt{N_n} \end{align} and so the emission and absorption probabilities for a given pair of states are related by \begin{equation} w_e/w_a=(N_n+1)/N_n \end{equation}
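As a quick worked check of this ratio (my own addition, under the assumptions that the photons are in thermal equilibrium at temperature $T$, that the two atomic levels are non-degenerate, and that units are chosen with $\hbar=k_B=1$): inserting the Planck occupation number \begin{equation} N_n=\frac{1}{e^{\omega/T}-1} \end{equation} gives \begin{equation} \frac{w_e}{w_a}=\frac{N_n+1}{N_n}=e^{\omega/T} \end{equation} This is exactly what detailed balance requires. If $n_u$ and $n_l$ denote the populations of the upper and lower atomic levels (labels introduced here only for illustration), equilibrium demands $n_u w_e=n_l w_a$, and the Boltzmann ratio $n_u/n_l=e^{-\omega/T}$ then forces $w_e/w_a=e^{\omega/T}$.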

$\textbf{Edit:}$ $\textit{Some further questions elaborated.}$

As was stated in Peter Shor's answer, one way of thinking about this is that the factor of $(N_n+1)$ appearing in the probability rate is due to the fact that photons are bosons, and "like to group together" to go see Star Wars movies. Photons of a certain frequency in the initial state encourage there to be another photon of such a frequency in the final state, and the electron obliges by emitting this photon. There's an important point here too: the photons of type $n$, i.e. frequency $\omega_n$, in the initial state encourage there to be more photons of the same type $n$, frequency $\omega_n$, in the final state. So the photon the electron spits out by stimulated emission is $\textit{in phase}$ with the original photons, i.e. of the same type. All this is simply a consequence of the algebra of bosonic creation/annihilation operators. It's not the case that energy has been "given to" the electrons in any way: clearly, it is the electron that has given up energy to the photon bunch, because it has emitted a photon. What happened is that the probability rate of the electron doing that has been increased.

Steven Sagona asks: $\textit{"why do atoms have such a Hamiltonian"?}$ The $j\cdot A$ Hamiltonian is the Hamiltonian of electromagnetism. All interactions between photons and matter are described by this Hamiltonian, as this is the only Hamiltonian allowed by gauge invariance and Lorentz invariance.

Another question asks about the role of dipole moments and population inversion. Neither of these is actually necessary to understand the notion of stimulated emission, which is simply our factor of $N_n$, as explained. For completeness we'll give a quick explanation of the role of those terms in laser physics.

The way a laser works is essentially: you put energy into the system - "pumping" - and thereby drive the atoms into excited states. Population inversion is simply the situation when you have more atoms in excited states than in the ground state. Then you expose your excited atoms to photons, and the electrons are stimulated to drop back down to the ground state and spit out photons that are in-phase ("of the same type") as the incident photons, for the reasons explained above. Then those stimulatedly-emitted photons fly around bumping into more electrons, and cause them to undergo stimulated emission, and so on in a snowballing effect of more and more in-phase photons, until you gradually run out of your excited electrons. This gives you a whole bunch of coherent photons. Again, no dipoles necessary here.

If we wanted to calculate the emission rates more exactly, we'd have to calculate $\mathscr{V}_{fi}$. When the wavelength of the photon is large compared to the size of the atom, the dominant contribution to this matrix element is from dipole radiation. There are selection rules that determine whether an initial and final state can be connected by a dipole transition, https://en.wikipedia.org/wiki/Selection_rule. We can calculate $\mathscr{V}_{fi}$ more precisely by expanding our expression for $j\cdot A$ in a multipole expansion. I could step through all these details mathematically but it would be overkill - the basic point is that the symmetries of the states the electron is jumping between determine whether that process is allowed or not. Practically, for the snowball process explained above to work, you want the electrons to stay in their excited states for a long time (ie you want them to be metastable) so that the photons get a chance to reach them and snowball off them. The origin of metastable states is usually that: spontaneously jumping from that metastable state to the ground state is forbidden by a selection rule https://en.wikipedia.org/wiki/Metastability#Atomic_and_molecular_physics so falling out of the metastable state is unlikely. This means the probability of the electron spontaneously returning to the ground state is low, but the probability of it returning to the ground state via stimulated emission can be high due to that large factor of $N_n$ compensating. This is good: spontaneous emission spits out random out-of-phase photons, but we want stimulated emission so that we can have in phase photons (that's the point of a laser). So selection rules allow us to choose good metastable states, and that's what allows us to make the most of those excited atoms and get as many stimulated emission events out of them before they all de-excite. 
But this is a system-dependent detail, and plays no essential role in the phenomenon of stimulated emission per se: it's a practical necessity for ensuring that the electrons in a laser stay excited long enough to undergo stimulated emission.


The question called for a detailed answer, so I'll show an explicit calculation, using the Schrödinger equation, in a toy model that exhibits stimulated emission. Most of the effort goes into constructing the model and explaining what the various pieces mean. Once this is done, the calculation itself is relatively quick and easy, and the interpretation of the result is straightforward.


The model

A simple type of laser works by putting the molecules of the lasing material into a relatively long-lived excited state, one that would eventually decay on its own (releasing a photon) even if it were not "stimulated." If it does decay on its own, the emitted photon is in a superposition of different momenta, with no preference for momenta parallel to the long axis of the laser. The model will illustrate what happens when other photons, emitted by other previously-excited molecules, are already present. The model includes:

  • a single two-level molecule;

  • two different photon modes, representing two different momenta with the same magnitude and different (say, orthogonal) directions.

The model involves two parameters:

  • a real parameter $\lambda$ that determines the strength of the interaction between the molecule and the photons;

  • a real parameter $\omega$ representing the energy of the molecule's excited state (relative to the ground state). The same parameter $\omega$ also represents the energy of a single photon (either mode).

Units with $\hbar=1$ are being used here. Altogether, the Hamiltonian is $$ H = \omega\, a^\dagger a + \omega\, b^\dagger b + \omega\, c^\dagger c + \lambda \big(c^\dagger (a+b) + (a+b)^\dagger c\big), \tag{1} $$ where $a,b,c$ are operators having the following significance:

  • $a^\dagger$ and $a$ are the creation and annihilation operators, respectively, for photons with one momentum;

  • $b^\dagger$ and $b$ are the creation and annihilation operators, respectively, for photons with the other momentum;

  • the operator $c^\dagger$ promotes the molecule from its ground state to the excited state, and the operator $c$ moves it from the excited state back to the ground state.

To ensure that the model involves only two energy levels for the molecule, the operators $c,c^\dagger$ are taken to satisfy the anticommutation relations $$ cc = 0 \hskip2cm c^\dagger c^\dagger = 0 \hskip2cm cc^\dagger+c^\dagger c = 1. \tag{2} $$ In contrast, the photon operators $a,b$ satisfy the usual boson commutation relations $$ aa^\dagger-a^\dagger a=1 \hskip2cm bb^\dagger-b^\dagger b=1 \tag{3} $$ and:

  • $a$ and $a^\dagger$ commute with $b$ and $b^\dagger$

  • $a$ and $a^\dagger$ commute with $c$ and $c^\dagger$

  • $b$ and $b^\dagger$ commute with $c$ and $c^\dagger$

The interaction terms in the Hamiltonian, the terms multiplied by $\lambda$, are $$ c^\dagger (a+b) \hskip1cm \text{and} \hskip1cm (a+b)^\dagger c. $$ The first one describes the absorption of an $a$-photon or $b$-photon by the molecule, and the second one describes emission. Both terms must be present because the Hamiltonian must be self-adjoint. To complete the definition of the model, let $|0\rangle$ denote the state with no photons and in which the molecule is in its ground state, so $$ a|0\rangle=0 \hskip2cm b|0\rangle=0 \hskip2cm c|0\rangle=0. \tag{4} $$ Now, suppose that the molecule has been prepared in its excited state and that $N$ photons are already present in mode $a$, so the initial state of the system is $$ |\psi(0)\rangle = \big(a^\dagger\big)^N c^\dagger|0\rangle. \tag{5} $$ Working in the Schrödinger picture, the state evolves in time according to $$ i\frac{\partial}{\partial t}|\psi(t)\rangle = H|\psi(t)\rangle $$ with $H$ given by (1).
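To make the model concrete, here is a minimal numerical sketch of it in NumPy. It is only an illustration under my own assumptions: the photon modes are truncated at `n_max` photons, the values of `omega` and `lam` are arbitrary, and the function name `build_model` is hypothetical. The molecule is represented on a two-dimensional space with basis (ground, excited).

```python
import numpy as np

def build_model(n_max, omega=1.0, lam=0.1):
    """Matrices for a, b, c and the Hamiltonian (1), photon numbers limited to 0..n_max."""
    d = n_max + 1
    lower = np.diag(np.sqrt(np.arange(1, d)), k=1)  # single-mode annihilation operator
    sigma = np.array([[0.0, 1.0], [0.0, 0.0]])      # c: excited -> ground
    eye_ph, eye_mol = np.eye(d), np.eye(2)
    # Tensor-product ordering: (mode a) x (mode b) x (molecule)
    a = np.kron(np.kron(lower, eye_ph), eye_mol)
    b = np.kron(np.kron(eye_ph, lower), eye_mol)
    c = np.kron(np.kron(eye_ph, eye_ph), sigma)
    H = (omega * (a.T @ a + b.T @ b + c.T @ c)
         + lam * (c.T @ (a + b) + (a + b).T @ c))
    return a, b, c, H

a, b, c, H = build_model(n_max=3)
identity = np.eye(H.shape[0])
vac = np.zeros(H.shape[0])
vac[0] = 1.0  # no photons, molecule in its ground state

print(np.allclose(c @ c, 0))                       # cc = 0, from (2)
print(np.allclose(c @ c.T + c.T @ c, identity))    # cc^dagger + c^dagger c = 1, from (2)
print(np.allclose(a @ c, c @ a))                   # photon operators commute with c
print(np.allclose(H, H.T))                         # H is self-adjoint (real symmetric here)
print(np.allclose(a @ vac, 0), np.allclose(b @ vac, 0), np.allclose(c @ vac, 0))  # (4)
# The boson commutator (3) holds on states below the cutoff, e.g. on the vacuum:
print(np.allclose((a @ a.T - a.T @ a) @ vac, vac))
```

Every check should print `True`. The one caveat is that the truncated matrices satisfy (3) only on states below the photon-number cutoff, which is an artifact of the truncation and not a feature of the model.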


The calculation

At the initial time $t=0$, the right-hand side can be evaluated explicitly: \begin{align*} \left.i\frac{\partial}{\partial t}|\psi(t)\rangle\,\right|_{t=0} &= (N+1)\omega\,|\psi(0)\rangle + \lambda \big(a^\dagger\big)^N (a^\dagger+b^\dagger)|0\rangle \\ &= (N+1)\omega\,|\psi(0)\rangle + |A\rangle+|B\rangle \tag{6} \end{align*} with $$ |A\rangle \equiv \lambda \big(a^\dagger\big)^{N+1}|0\rangle \hskip2cm |B\rangle \equiv \lambda \big(a^\dagger\big)^{N}b^\dagger |0\rangle. \tag{7} $$ The interaction term involving $c^\dagger$ does not contribute to (6), because $(c^\dagger)^2=0$. The commutation relations for the photon operators imply $$ \frac{\langle A|A\rangle}{\langle B|B\rangle} =\frac{(N+1)!}{N!} = N+1. \tag{8} $$ To derive (8) quickly, notice that equation (3) says that $a$ acts formally like the "derivative" with respect to $a^\dagger$, so $$ a\big(a^\dagger\big)^n|0\rangle=n\big(a^\dagger\big)^{n-1}|0\rangle. $$
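Equation (8) can also be checked by brute force. The sketch below (again with arbitrary parameter values, a photon-number cutoff, and a function name `initial_growth_ratio` that I made up for this purpose) applies $H$ to the initial state (5), subtracts the part proportional to $|\psi(0)\rangle$, and splits the remainder into the $|A\rangle$ and $|B\rangle$ pieces according to the number of $b$-photons.

```python
import numpy as np

def initial_growth_ratio(N, omega=1.0, lam=0.1):
    """Return <A|A>/<B|B> from equations (6)-(7) for N photons initially in mode a."""
    d = N + 2  # photon numbers 0..N+1 are enough for a single emission
    lower = np.diag(np.sqrt(np.arange(1, d)), k=1)
    sigma = np.array([[0.0, 1.0], [0.0, 0.0]])  # c: excited -> ground
    eye_ph, eye_mol = np.eye(d), np.eye(2)
    a = np.kron(np.kron(lower, eye_ph), eye_mol)
    b = np.kron(np.kron(eye_ph, lower), eye_mol)
    c = np.kron(np.kron(eye_ph, eye_ph), sigma)
    H = (omega * (a.T @ a + b.T @ b + c.T @ c)
         + lam * (c.T @ (a + b) + (a + b).T @ c))

    vac = np.zeros(2 * d * d)
    vac[0] = 1.0
    psi0 = np.linalg.matrix_power(a.T, N) @ c.T @ vac  # equation (5), unnormalized

    remainder = H @ psi0 - (N + 1) * omega * psi0      # |A> + |B> in equation (6)
    ket_B = (b.T @ b) @ remainder                      # the part with exactly one b-photon
    ket_A = remainder - ket_B                          # the part with no b-photon
    return (ket_A @ ket_A) / (ket_B @ ket_B)

for N in range(5):
    print(N, initial_growth_ratio(N))
```

For each $N$ the printed ratio should agree with $N+1$ up to floating-point rounding, independent of the values chosen for `omega` and `lam`.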


Interpretation

Now consider the significance of the result (6)-(8). The right-hand side of (6) is a quantum superposition of three terms:

  • a term proportional to $|\psi(0)\rangle$ in which the molecule has not yet decayed,

  • a term $|A\rangle$ in which the molecule has decayed by emitting an $a$-photon,

  • a term $|B\rangle$ in which the molecule has decayed by emitting a $b$-photon.

Of course, this represents only the initial trend, because equation (6) is evaluated at $t=0$. But for the purpose of building intuition with relatively little calculation, this is sufficient.

First consider the case $N=0$, representing the situation with no photons present in the initial state, so the molecule decays on its own, without stimulation. In this case, equation (8) says that the $|A\rangle$ and $|B\rangle$ terms have the same magnitude, so equation (6) says that the photon is emitted in an equal superposition of both momenta, with no preference for either one. This is spontaneous emission.

Now consider the case $N\geq 1$, representing the situation with one or more $a$-photons present in the initial state. In this case, equation (8) says that the squared-magnitude of the $|A\rangle$ term is greater than the squared-magnitude of the $|B\rangle$ term by a factor of $N+1\geq 2$. Therefore, although the photon is still emitted in a superposition of both momenta because both terms are present in equation (6), it is now emitted preferentially with the $a$-momentum because the $A$ term in equation (6) has a larger magnitude than the $B$ term. The ratio $N+1$ says that the more $a$-photons are present in the initial state, the stronger this preference is. This is stimulated emission.
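Since equation (6) only describes the trend at $t=0$, it may help to see the same preference in an actual (short) time evolution. The following sketch is my own illustration, not part of the derivation above: it diagonalizes the truncated Hamiltonian with `numpy.linalg.eigh`, evolves the normalized initial state for a short time `t`, and compares the probability of finding one extra $a$-photon with the probability of finding one $b$-photon. The parameter values and the name `emission_probabilities` are hypothetical.

```python
import numpy as np

def emission_probabilities(N, t, omega=1.0, lam=0.1):
    """Probabilities at time t of having emitted into mode a or into mode b."""
    d = N + 2  # photon numbers 0..N+1 cover every state reachable from the initial one
    lower = np.diag(np.sqrt(np.arange(1, d)), k=1)
    sigma = np.array([[0.0, 1.0], [0.0, 0.0]])  # c: excited -> ground
    eye_ph, eye_mol = np.eye(d), np.eye(2)
    a = np.kron(np.kron(lower, eye_ph), eye_mol)
    b = np.kron(np.kron(eye_ph, lower), eye_mol)
    c = np.kron(np.kron(eye_ph, eye_ph), sigma)
    H = (omega * (a.T @ a + b.T @ b + c.T @ c)
         + lam * (c.T @ (a + b) + (a + b).T @ c))

    def index(n_a, n_b, mol):
        """Position of |n_a, n_b, mol> in the tensor-product basis (mol: 0 ground, 1 excited)."""
        return (n_a * d + n_b) * 2 + mol

    psi0 = np.zeros(2 * d * d)
    psi0[index(N, 0, 1)] = 1.0  # normalized version of equation (5)

    # Schrodinger evolution via the spectral decomposition of the real symmetric H
    evals, evecs = np.linalg.eigh(H)
    psi_t = evecs @ (np.exp(-1j * evals * t) * (evecs.T @ psi0))

    p_a = abs(psi_t[index(N + 1, 0, 0)]) ** 2  # molecule relaxed, extra a-photon
    p_b = abs(psi_t[index(N, 1, 0)]) ** 2      # molecule relaxed, one b-photon
    return p_a, p_b

for N in range(4):
    p_a, p_b = emission_probabilities(N, t=0.01)
    print(N, p_a / p_b)
```

For short times (here $\lambda t = 10^{-3}$) the printed ratio should be close to $N+1$. At later times the molecule can reabsorb photons, so the ratio need not stay at that value; the sketch makes no claim about the long-time behavior.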

This simple model did not account for the walls that contain the lasing material, but we can suppose that the walls are designed (using mirrors, etc) so that photons in mode $a$ (say, with momentum parallel to the long axis of the laser) remain in the lasing cavity longer than photons in mode $b$. This introduces a slight tendency to have more $a$-photons than $b$-photons after the initially-excited molecules begin to decay, and then the stimulated-emission effect amplifies this tendency more and more strongly as the number of $a$-photons increases. Eventually, the number of $a$-photons being emitted (stimulated or otherwise) balances the number of $a$-photons being absorbed (the Hamiltonian (1) includes both terms), and the process plateaus.


Edit: These clarifications were posted as comments, but the trail of comments was becoming long, so I moved the clarifications into this appendix.

As a comment pointed out, this simple model is oversimplified in several respects. In particular, it includes only two photon momenta. A more realistic model should include many photon momenta, and a proof that lasing actually occurs would need to show that the effect of stimulation in a small fraction of those modes is sufficient. However, the purpose of the simple model presented here is not to try to prove that lasing occurs; the purpose is to illustrate the phenomenon of stimulated emission in a simple way.

Another concern was raised about treating the norm-squared of a term on the right-hand side of (6) as a transition probability. That was not the intent. Equation (8) is only meant to say that in equation (6), the contribution of the $A$ term is (initially) growing faster than that of the $B$ term. With a single photon as the stimulator, the emission for that one mode will be enhanced relative to other modes; but emission in the other modes still occurs. Before we interrupt things with a measurement, all of these things are occurring continuously together as part of the quantum superposition according to the Schrödinger equation, but some contributions are growing faster than others, which will affect the distribution of outcomes when a measurement finally does occur.

A comment by Steven Sagona mentioned that true single-photon sources are difficult to prepare. A more realistic source might prepare a state like $$ |0\rangle +\alpha a^\dagger|0\rangle +\frac{1}{2}(\alpha a^\dagger)^2|0\rangle +\cdots $$ with a relatively small magnitude of the coefficient $|\alpha|$, so that higher-order terms are negligible. To analyze stimulated emission when the stimulating photon(s) come from such a source, we can simply replace equation (5) with a superposition involving different values of $N$ (such as $N=0$, $1$, and $2$). Since the Schrödinger equation is linear, this has the effect of replacing equations (7) with the corresponding superpositions. By comparing the norm of each term having a $b$-photon with the associated term that has an extra $a$-photon instead, we again conclude that the latter term is growing faster (at least initially) than the former in terms where at least one $a$-photon was present initially. The overall effect is weaker because the dominant term (the one with no photons present initially) does not include any stimulation, but the stimulated emission effect still occurs in the other terms (the ones that do have photons present initially).
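To put rough numbers on "weaker", here is a sketch of the bookkeeping, using labels $|A_N\rangle$, $|B_N\rangle$ and $w_N$ that I am introducing only for this illustration. Take the initial state to be $\sum_N \frac{\alpha^N}{N!}\big(a^\dagger\big)^N c^\dagger|0\rangle$, and let $|A_N\rangle$ and $|B_N\rangle$ denote the states (7) for a given $N$. Terms with different photon numbers are orthogonal, and $\langle B_N|B_N\rangle=\lambda^2 N!$, so summing the squared norms with the coefficients $\alpha^N/N!$ and using (8) gives $$ \frac{\sum_N \frac{|\alpha|^{2N}}{(N!)^2}\langle A_N|A_N\rangle}{\sum_N \frac{|\alpha|^{2N}}{(N!)^2}\langle B_N|B_N\rangle} =\frac{\sum_N w_N\,(N+1)}{\sum_N w_N}, \hskip2cm w_N\equiv\frac{|\alpha|^{2N}}{N!}. $$ Keeping only $N=0,1,2$ as in the state written above, this is $$ \frac{1+2|\alpha|^2+\frac{3}{2}|\alpha|^4}{1+|\alpha|^2+\frac{1}{2}|\alpha|^4} \approx 1+|\alpha|^2 \hskip1cm \text{for small } |\alpha|. $$ So for a weak source the overall preference for the $a$-mode exceeds $1$ only by roughly $|\alpha|^2$, even though each individual term with $N\geq 1$ still carries the full factor $N+1$. As with equation (8), this compares initial growth of the contributions and is not meant as a transition probability.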

That comment raises an interesting point. Even in this single-molecule model, and even with a true single-photon stimulus so that there is no entanglement in the initial state, the output light still comes out entangled with the molecule. This trend is already evident in equation (6), whose right-hand side is a superposition of two terms:

  • A term with $N$ photons and an excited molecule (the term involving $\omega$)

  • A term with $N+1$ photons and a relaxed molecule (the term $|A\rangle+|B\rangle$).

The entanglement is even more pronounced in a model with lots of molecules, because the final state is a superposition of many different numbers of molecules having emitted their photons. Since it's entangled, exactly what pure state (if any) best represents the output light (e.g., a coherent state) can be a tricky question, one whose answer probably requires careful consideration of "decoherence".

The original question was:

How can "giving" energy (in the form of photons) to electrons stimulate them to come to a lower energy state?

The key message of this answer is that stimulated emission is not about giving energy to the molecule. Energy must be given to the molecule in order to put it into the excited state in the first place; but the phenomenon of stimulated emission occurs because photons are bosons, as expressed by equation (3). This is what leads to the factor $N+1$ in equation (8).