Electron shells in atoms: What causes them to exist as they do?

Any answer based on analogies rather than mathematics is going to be misleading, so please bear this in mind when you read this.

Most of us will have discovered that if you tie one end of a rope to a wall and wave the other, you can get standing waves on it like this:

[Figure: standing waves on a rope, modes A, B and C]

Depending on how fast you wave the end of the rope you can get half a wave (A), one wave (B), one and a half waves (C), and so on. But you can't have 3/5 of a wave or 4.4328425 waves. You can only have a whole number of half-waves. The number of waves is quantised.
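As a minimal sketch of that counting rule, assuming an idealised rope held fixed at both ends (the function names are illustrative, not from any library), the allowed patterns are exactly those where a whole number of half-wavelengths spans the rope:

```python
from fractions import Fraction

def allowed_wave_counts(n_modes):
    """Number of full waves on the rope for the first n_modes standing waves."""
    return [Fraction(n, 2) for n in range(1, n_modes + 1)]

def is_allowed(waves):
    """True if `waves` is a positive whole number of half-waves."""
    doubled = 2 * Fraction(waves)
    return doubled > 0 and doubled.denominator == 1

def wavelength(rope_length, n):
    """Wavelength of the n-th mode: n half-wavelengths span the rope."""
    return 2 * rope_length / n

print([str(w) for w in allowed_wave_counts(3)])  # modes A, B and C
print(is_allowed(Fraction(3, 5)))                # 3/5 of a wave
print(is_allowed("4.4328425"))                   # 4.4328425 waves
```

The first line lists 1/2, 1 and 3/2 waves, and the two checks that follow both come back `False`.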

This is basically why electron energies in an atom are quantised. You've probably heard that electrons behave as waves as well as particles. Well, if you're trying to cram an electron into a confined space, you'll only be able to do so if the electron wavelength fits neatly into the space. This is a lot more complicated than just waving a rope because an atom is a 3D object, so you have 3D waves. However, take for example the first three $s$ wavefunctions, which are spherically symmetric, and look at how they vary with distance. For a hydrogen atom you get $^1$:

[Figure: radial plots for the first three $s$ wavefunctions of hydrogen]

Unlike the rope, the waves aren't all the same size and length, because the potential around a hydrogen atom varies with distance. However, you can see a general similarity with the first three modes of the rope.

And that's basically it. Energy increases with decreasing wavelength, so the "half wave" $1s$ level has a lower energy than the "one wave" $2s$ level, and the $2s$ has a lower energy than the "one and a half wave" $3s$ level.
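To put rough numbers on "energy increases with decreasing wavelength", here is a sketch that swaps the atom for a much simpler toy model: an electron trapped in a one-dimensional box. The 1 nm box width is an arbitrary assumption. It combines the de Broglie relation $p = h/\lambda$ with $E = p^2/2m$. Real hydrogen levels do not follow this box pattern, so only the trend carries over.

```python
# Toy model (NOT the hydrogen atom): an electron in a 1D box.
H = 6.62607015e-34      # Planck constant, J s
M_E = 9.1093837015e-31  # electron mass, kg
EV = 1.602176634e-19    # joules per electronvolt

def box_wavelength(n, width):
    """n half-waves fit in the box, so the wavelength is 2 * width / n."""
    return 2.0 * width / n

def box_energy_ev(n, width):
    """Kinetic energy E = p^2 / (2 m), with de Broglie momentum p = h / wavelength."""
    p = H / box_wavelength(n, width)
    return p * p / (2.0 * M_E) / EV

WIDTH = 1e-9  # assumed box width of 1 nm, chosen only for illustration
for n in (1, 2, 3):
    print(n, box_wavelength(n, WIDTH), round(box_energy_ev(n, WIDTH), 3))
```

Each extra half-wave shortens the wavelength and raises the energy, which is the same ordering as $1s < 2s < 3s$.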


$^1$ The graphs are actually the electron radial probability distribution $P(r) = \psi\psi^* \, 4\pi r^2$. I did try plotting the wavefunction, but it was less visually effective.
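For anyone who wants to reproduce curves of this kind, here is a sketch that evaluates $P(r)$ for the $1s$, $2s$ and $3s$ states using the standard textbook hydrogen wavefunctions in atomic units (distances in Bohr radii). The grid spacing and cutoff are arbitrary assumptions, and this is not necessarily how the original graphs were produced.

```python
import math

# Textbook hydrogen s wavefunctions in atomic units (r in Bohr radii).
def psi_1s(r):
    return math.exp(-r) / math.sqrt(math.pi)

def psi_2s(r):
    return (2.0 - r) * math.exp(-r / 2.0) / (4.0 * math.sqrt(2.0 * math.pi))

def psi_3s(r):
    poly = 27.0 - 18.0 * r + 2.0 * r * r
    return poly * math.exp(-r / 3.0) / (81.0 * math.sqrt(3.0 * math.pi))

def radial_probability(psi, r):
    """P(r) = |psi|^2 * 4 pi r^2 for a real, spherically symmetric psi."""
    return psi(r) ** 2 * 4.0 * math.pi * r * r

def sample(psi, r_max=60.0, dr=0.01):
    """P(r) on an evenly spaced grid from 0 to r_max (assumed cutoff)."""
    steps = int(round(r_max / dr))
    return [radial_probability(psi, i * dr) for i in range(steps + 1)]

def total_probability(values, dr=0.01):
    """Trapezoidal integral of P(r); should come out close to 1."""
    return dr * (sum(values) - 0.5 * (values[0] + values[-1]))

def count_peaks(values):
    """Number of strict local maxima, i.e. humps in the plot."""
    return sum(1 for a, b, c in zip(values, values[1:], values[2:]) if a < b > c)

for name, psi in (("1s", psi_1s), ("2s", psi_2s), ("3s", psi_3s)):
    p = sample(psi)
    print(name, round(total_probability(p), 3), count_peaks(p))
```

Each state should integrate to about 1, and the hump count goes one, two, three, which is the resemblance to rope modes A, B and C.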


First of all, strictly speaking, electron shells (as well as atomic orbitals) do not exist in atoms with more than one electron. Such a physical model of an atom is simplified (and often oversimplified). It arises from a mathematical approximation, which physically corresponds to the situation where electrons do not instantaneously interact with each other; instead, each electron interacts with the average, or mean, electric field created by all the other electrons.

This approximation is known as the mean-field approximation, and in it the state (or, speaking classically, the motion) of each electron is independent of the state (motion) of all the other electrons in the system. Not surprisingly, the simplified physical model that results is often referred to as the independent-electron model.

So the question of why nature works in this way does not make a lot of sense, since nature does not actually work this way, except for systems with only one electron, such as the hydrogen atom. In any case, the answer to the question of why something works in this or that way in physics is pretty simple: according to the laws of a particular physical theory, say, quantum mechanics. And I could not explain quantum mechanics to you here in just a few sentences. You need to read some books.

But if your question is why nature works in this way according to quantum mechanics, i.e. why things in quantum mechanics are the way they are, then I would like to quote Paul Dirac:

[...] the main object of physical science is not the provision of pictures, but is the formulation of laws governing phenomena and the application of these laws to the discovery of new phenomena. If a picture exists, so much the better; but whether a picture exists or not is a matter of only secondary importance. In the case of atomic phenomena no picture can be expected to exist in the usual sense of the word 'picture', by which is meant a model functioning essentially on classical lines. One may, however, extend the meaning of the word 'picture' to include any way of looking at the fundamental laws which makes their self-consistency obvious. With this extension, one may gradually acquire a picture of atomic phenomena by becoming familiar with the laws of the quantum theory.

From "The Principles of Quantum Mechanics", §4.


A big part of it can be explained by combining the constraints of quantum mechanics with the geometry of angular momentum.

For the special case of the hydrogen atom, it turns out that when you solve the equations of motion for an electron near a proton, you can't give the electron any old energy. There's a set of energies that are allowed; all others are excluded. You can put these energies in order, starting from the most tightly bound, and give each one a number. This is often called the "principal quantum number," $n$, and it can be any positive integer. The binding energy of an electron in the $n$-th state is $13.6\,\mathrm{eV}/n^2$.
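As a quick sketch of that formula (the function name is made up for illustration, and 13.6 eV is the rounded value quoted above), the first few allowed binding energies are:

```python
RYDBERG_EV = 13.6  # hydrogen ground-state binding energy in eV (rounded)

def binding_energy_ev(n):
    """Binding energy of the n-th hydrogen state, 13.6 eV / n^2."""
    if not isinstance(n, int) or n < 1:
        raise ValueError("the principal quantum number must be a positive integer")
    return RYDBERG_EV / n ** 2

for n in range(1, 5):
    print(n, round(binding_energy_ev(n), 3))
```

The levels crowd together as $n$ grows: 13.6, 3.4, about 1.511, then 0.85 eV.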

You can also ask (again, using the mathematical tools of quantum mechanics) whether the electron can carry angular momentum. It turns out that it can, but again the amount of angular momentum it can carry comes in lumps, and again we can put the angular momentum states in order, starting with the least. Unlike with the principal quantum number, it makes sense to talk about an atom whose angular momentum is zero, so the "angular momentum quantum number" $\ell$ starts counting from zero. For a very sneaky reason, $\ell$ must be smaller than $n$. So an electron in its ground state, $n=1$, must have $\ell=0$; an electron in the first excited state $n=2$ may have $\ell=0$ or $\ell=1$; and so on.

Now once you have started to ask about angular momentum you start to think about planets orbiting a star, and that suggests a question: what is the orientation of the orbit? Must all the electrons orbit in the same plane, like all the planets in the solar system are found roughly along the plane of the ecliptic? Or can electrons orbiting a nucleus occupy any random plane, the way that comets do? This is a question you can also address with quantum mechanics. It turns out (again) that only certain orientations are allowed, and the number of orientations that are allowed depends on $\ell$, and that you can put the orientations in order. For a state with $\ell=0$ there is only one orientation permitted. For a state with $\ell=1$ there are three orientations permitted; sometimes it makes sense to number them with the "angular momentum projection quantum number" $m \in \{-1,0,1\}$, and other times it makes sense to identify them with the three axes $x,y,z$ of a coordinate system. For $\ell=2$, likewise, it sometimes makes sense to identify orientations $m \in \{-2,-1,0,1,2\}$, and other times to identify the orientations with electrons along the axes and planes of the coordinate system. I think the chemists may even have a geometrical interpretation for the seven substates of $\ell=3$, but I'm not familiar with it.
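The two counting rules so far ($\ell$ smaller than $n$, and $m$ running from $-\ell$ to $\ell$) can be sketched in a few lines. The helper names are hypothetical:

```python
def allowed_l(n):
    """Angular momentum quantum numbers allowed for a given n: 0, 1, ..., n - 1."""
    return list(range(n))

def allowed_m(l):
    """Projection quantum numbers for a given l: -l, ..., 0, ..., +l."""
    return list(range(-l, l + 1))

for n in (1, 2, 3):
    for l in allowed_l(n):
        print(n, l, allowed_m(l))
```

For $\ell = 0, 1, 2, 3$ the lists have 1, 3, 5 and 7 entries, matching the orientation counts described above.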

When you start to add multiple electrons to one nucleus, several things change — most notably the interaction energy, since the electrons interact with each other as well as with the nucleus. The basic picture, that each electron must carry integer angular momentum $\ell$ which may lie on any of $2\ell+1$ directions, remains unchanged. But there is one final quirk: each state with a given $n,\ell,m$ may hold no more than two electrons! We can fit this into our picture by assigning each electron a fourth quantum number $s$, called the "spin quantum number" for reasons that you should totally look up later, which can only take two values. Now we have a very simple rule: a "state" described by the four numbers $n,\ell,m,s$ can hold zero or one electrons at a time.
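To see what the "zero or one electron per state" rule implies, here is a sketch that lists every $(n,\ell,m,s)$ combination for a given $n$. Labelling the two spin values as $\pm 1/2$ is the usual convention rather than something stated above, and the function name is illustrative.

```python
SPINS = (-0.5, 0.5)  # assumed labels for the two allowed spin values

def states(n):
    """All (n, l, m, s) combinations for a given n in the independent-electron picture."""
    return [
        (n, l, m, s)
        for l in range(n)              # l = 0 .. n - 1
        for m in range(-l, l + 1)      # 2l + 1 orientations
        for s in SPINS                 # two spin values
    ]

for n in range(1, 5):
    print(n, len(states(n)))
```

The counts come out as 2, 8, 18 and 32, that is $2n^2$ slots for each $n$. This only counts slots; it says nothing about the order in which a real atom fills them.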

After that preamble, have a look at a periodic table:

[Figure: a periodic table]

  • Over on the left are two columns of highly reactive elements. These have the outermost electron with $\ell=0$ (one value of $m$ allowed, two values of $s$).

  • Over on the right are six columns of (mostly) nonmetals. These have the outermost electron with $\ell=1$ (three values of $m$ allowed, times two values of $s$).

  • In the middle are ten columns of metals. These have outermost electrons with $\ell=2$ (five values of $m$ allowed, times two values of $s$).

  • Appended on the bottom of the chart, because there's too much blank space on the page if they're inserted between columns two and three, are fourteen columns of lanthanides and actinides. These have outermost electrons with $\ell=3$ (seven values of $m$, times two values of $s$).
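The column counts in that list are just $2(2\ell+1)$ for $\ell = 0, 1, 2, 3$. A short sketch, using the conventional letters s, p, d, f as labels (they are not used in this answer, and the function name is illustrative):

```python
BLOCK_LETTERS = {0: "s", 1: "p", 2: "d", 3: "f"}  # conventional labels for each l

def block_width(l):
    """Columns in the block: (2l + 1) values of m, times two values of s."""
    return 2 * (2 * l + 1)

for l, letter in BLOCK_LETTERS.items():
    print(letter, l, block_width(l))
```

This gives 2, 6, 10 and 14 columns, in the same order as the four bullets.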

This simple model doesn't explain everything about the periodic table and electron shells. My description puts helium in the wrong spot (it's not a reactive metal because the most tightly bound electron shell is special), and the heavier metals leak over into the $\ell=1$ block. You have to do some serious modeling to understand why the $\ell=2$ electrons aren't allowed until the fourth row, rather than the third row. Protons and neutrons in the nucleus have the same sort of shell structure, but nuclear magic numbers don't always occur after the filling of an $\ell=1$ shell the way the noble gases do. But that is about the shape of things.