How was the first atomic clock calibrated?

More specifically, caesium atomic clocks realize the second (see this Q&A for the meaning of realization); said another way, they are a primary frequency standard. Generally, when a new primary standard is being developed for any quantity, not only time, and has not yet been adopted by international agreement as the primary standard, it must be calibrated against the existing primary standards.

The first caesium atomic clocks were developed during the 1950s (the first prototype was that of Essen and Parry in 1955, at the National Physical Laboratory, UK). At the time, the second was defined as the fraction 1/86400 of the mean solar day, which is an astronomical unit of time, that is, based on the rotation of the Earth and its motion in the solar system. So the first atomic clocks had to be calibrated against that definition of time, which remained in force until 1960.

However, scientists already knew that, due to the irregularities of the Earth's motion, mean solar time was not a good time scale, and they had already started to devise a new time scale based on ephemeris time. This was recognized as a more stable time scale even before its implementation, and so the first accurate measurement of the frequency of a caesium atomic clock was made in 1958 against the ephemeris second (whose definition would be ratified by the CGPM only in 1960), obtaining the value

$$\nu_\mathrm{Cs} = (9\,192\,631\,770\pm 20)\,\mathrm{Hz}$$

Note that since there is no device generating ephemeris time, which has to be obtained from the analysis of the motions of the Earth and Moon, this determination took about three years! When the second was redefined as an atomic unit in 1967, the above value was used to fix exactly the frequency associated with the hyperfine transition of the caesium ground state (see the 1967 resolution of the CGPM).

It's also worth noting that the relative uncertainty of that measurement is about $2\times 10^{-9}$; nowadays, caesium atomic clocks can be compared with relative uncertainties, limited by clock instability, of around $10^{-16}$, and even lower uncertainties, around $10^{-18}$, can be achieved in comparisons of optical atomic clocks. Quite a remarkable improvement from those days!
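To see where the $2\times 10^{-9}$ figure comes from, it is just the quoted uncertainty divided by the measured frequency:

```python
nu_cs = 9_192_631_770   # Hz, the 1958 measured caesium frequency
u = 20                  # Hz, the quoted uncertainty

# Relative (fractional) uncertainty of the 1958 measurement
print(u / nu_cs)        # ≈ 2.2e-9
```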

For more information about this history, I suggest the following wonderful book (though not up to date with the current state of the art):

C. Audoin and B. Guinot, The Measurement of Time: Time, Frequency and the Atomic Clock (Cambridge University Press, 2001).

The description of said experiment can be found in:

W. Markowitz et al., "Frequency of Cesium in Terms of Ephemeris Time", Phys. Rev. Lett. 1, 105-107 (1958).

L. Essen et al., "Variation in the Speed of Rotation of the Earth since June 1955", Nature 181, 1054 (1958).

For a bit of background, atomic clocks take advantage of the physics of magnetic resonance, for which Isidor Rabi was awarded the Nobel Prize in 1944. Rabi's molecular-beam magnetic resonance technique built on the earlier Stern-Gerlach experiment, which forms the core of the physics package of a caesium primary standard atomic clock. I say primary standard here because there are other caesium clocks which are not primary standards and instead use techniques like coherent population trapping. Stern also received the Nobel Prize, in 1943, though not for the Stern-Gerlach experiment.

So how does this relate to the caesium clock? The Stern-Gerlach experiment is, in essence, a beam-deflection experiment that can discriminate between atoms with different magnetic moments. In the original experiment Stern used silver; caesium, however, is better suited to a clock since its ground state has exactly two hyperfine levels. Caesium is heated in an oven inside a vacuum, producing a beam of atoms that escapes through a first magnet, which selects (steers) atoms in one hyperfine state into a microwave cavity tuned near the 9192... MHz that corresponds to the definition of the second. The microwave field pumps the caesium atoms into the other hyperfine state; emerging from the cavity, the atoms pass through a second magnet that steers the resonant atoms to a detector, where they are counted, while non-resonant atoms are steered away from the detector and are not counted. By carefully varying the frequency of the 9192... MHz microwave field and observing the intensity (count) of the Cs atoms at the detector, a feedback loop is established that directly relates the microwave frequency to the resonance of the Cs atom in the desired state. Maximizing the intensity of the detected Cs atoms by varying the microwave frequency, in effect, locks the synthesizer to the 9192... MHz from the SI definition.
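The feedback loop described above can be sketched as a toy servo. This is only a minimal model, not a real clock servo: the resonance is a Lorentzian stand-in for the true Ramsey lineshape, and the modulation depth and gain values are arbitrary assumptions. The idea is to probe the line on either side of the carrier and steer toward the frequency that maximizes the detected atom count:

```python
def resonance(f, f0=9_192_631_770.0, linewidth=100.0):
    """Detected atom count vs. microwave frequency: a simple
    Lorentzian stand-in for the real Ramsey resonance (peak at f0)."""
    return 1.0 / (1.0 + ((f - f0) / (linewidth / 2)) ** 2)

def lock_to_resonance(f_start, steps=200, mod=10.0, gain=50.0):
    """Toy frequency servo: probe the line at f +/- mod, compare the
    two counts, and steer the synthesizer toward the peak."""
    f = f_start
    for _ in range(steps):
        error = resonance(f + mod) - resonance(f - mod)
        f += gain * error  # positive error -> peak lies above f
    return f

# Start 30 Hz off resonance; the servo pulls the frequency back.
print(lock_to_resonance(9_192_631_770.0 + 30.0))
```

In a real standard the same square-wave modulation trick is done in hardware, and the recovered error signal steers a quartz oscillator whose multiplied output drives the cavity.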

This microwave frequency source is thus locked to the atoms and can be divided down to the output frequencies normally provided by a Cs reference clock (typically 10 MHz, 5 MHz, and 10.24 MHz). Normally a 1 PPS (one pulse per second) output is also provided, which is accurate to a small number of nanoseconds. The stability of a Cs primary reference clock exhibits an Allan deviation of about $10^{-15}$. This is exceeded by hydrogen masers at short averaging times, and by newer optical clocks, which have even better stability. Not bad for technology whose underlying physics dates to the 1920s and which was first realized as a clock by Louis Essen and Jack Parry in 1955. The Cs clock was predated by an NH3 clock at NIST (then NBS) in 1949, but that clock was less accurate than the quartz standards of the time.
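For completeness, the Allan deviation quoted above is computed from a series of fractional-frequency readings. Here is a minimal sketch of the simplest (non-overlapping, two-sample) estimator, checked against simulated white frequency noise, for which the Allan deviation at one sample interval equals the noise level:

```python
import numpy as np

def allan_deviation(y):
    """Two-sample (Allan) deviation at the basic sampling interval,
    from a series y of fractional-frequency values:
    sigma_y = sqrt( <(y[i+1] - y[i])^2> / 2 )."""
    d = np.diff(y)
    return np.sqrt(0.5 * np.mean(d ** 2))

# Simulated white frequency noise at the 1e-13 level
rng = np.random.default_rng(0)
y = rng.normal(0.0, 1e-13, 100_000)
print(allan_deviation(y))   # close to 1e-13
```

The Allan deviation is used instead of the ordinary standard deviation because clock noise is non-stationary; the classical variance does not converge for common clock noise types, while the two-sample variance does.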

So, to come back to the original question, the Cs primary reference standard does not need calibration, since it relies directly on the physics. Because it is a physical system, however, there are many systematic effects that must be accounted for: the stability of the oven temperature, the quality of the microwave source, the ability to control the microwave frequency, the characteristics of the electronic components used to divide the microwave frequency down to the output frequencies, and a host of others. In theory, one properly constructed Cs clock should perform the same as the next; to get an idea of how true this is, one can compare the performance of the clock ensembles at the various national labs such as NIST and USNO.

As for the GPS aspect of your question, there are some interesting facts. First, the GPS constellation compensates for relativity: the satellites move through space at sufficient velocity, and sit high enough in Earth's gravity well, that without this compensation the 10.23 MHz output frequency would appear incorrect to an earth-bound observer. As a result, the actual onboard clock frequency is set slightly lower. Also, the realization of the second takes the gravity-well effect into account, since atomic clocks run faster the higher in altitude they are. There is a good but casual experiment by Tom Van Baak (Project GREAT) that demonstrates this effect. There is also a reported NIST experiment in which an optical clock was used to detect a change in elevation as small as 1 meter; I do not remember the reference for this, however.
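The size of the GPS compensation can be estimated on the back of an envelope. This is only a sketch with approximate, assumed orbital parameters (circular orbit, mean Earth radius, rotation of the ground observer neglected); the gravitational blueshift dominates the kinematic time dilation, so the satellite clock runs fast by a few parts in $10^{10}$ and its synthesizer is set correspondingly low:

```python
# Approximate fractional frequency offset of a GPS satellite clock
# relative to a ground clock (assumed values, circular-orbit model).
GM = 3.986004e14        # Earth's gravitational parameter, m^3/s^2
c = 2.99792458e8        # speed of light, m/s
r_earth = 6.371e6       # mean Earth radius, m
r_orbit = 2.656e7       # GPS orbital radius, m

v_squared = GM / r_orbit                         # circular orbital speed^2
grav = GM * (1/r_earth - 1/r_orbit) / c**2       # gravitational blueshift
kinematic = -v_squared / (2 * c**2)              # special-relativistic slowing
net = grav + kinematic

print(net)                   # a few parts in 1e10: satellite clock runs fast
print(10.23e6 * (1 - net))   # offset synthesizer frequency in Hz
```

The net offset comes out near $4.45\times 10^{-10}$, which is why the satellite synthesizers are set a few millihertz below 10.23 MHz.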

Also, I have constrained this to Cs primary standard clocks; there are several other atomic clocks based on hydrogen, Rb, NH3, Hg, and Yb. Each has its advantages and disadvantages.

I especially want to thank Robert Lutwak for giving me the opportunity to learn all this wonderful stuff and work on the first commercial chip scale atomic clock - the Symmetricom CSAC. Who knew you could do so much with a $130\,\mathrm{mW}$ power budget and $1\,\mathrm{cm}^3$ volume?

References (web based to be easy to find):

Rabi, Stern-Gerlach, and Magnetic Resonance

A media treatment of the same subjects

A resource for time-nuts (some links are broken :-( )

All things weights and measures

Timekeeping at the US Naval Observatory

The NIST Time and Frequency Division in Boulder

Allan Variance and Clock Stability - by the person it is named after...

Of course one must acknowledge the HP 5071 (and its predecessors) and the team that built these extremely rugged devices: Len Cutler, Robin Giffard, et al.

For those interested in the history of timekeeping: