How can we see an atom now? What was the scale of this equipment?

The questions of whether you can detect light emitted from an (isolated) atom and whether you can resolve an atom from its neighbours are completely independent.

The spacing between neighbouring atoms in an ordinary material is impossible to resolve using visible light, whose wavelength is several thousand times larger. You can "see" individual atoms by using other microscopy techniques (see e.g. this short film for a nice example), but those rely on rather elaborate instrumentation and post-processing, and they do not reflect what is visible to the naked human eye.
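To put rough numbers on that comparison, here is a minimal sketch. The wavelength and the atomic spacing are illustrative assumptions (green light and a typical solid), not values taken from the picture, and the resolution estimate uses the standard Abbe criterion d = λ / (2 NA) with an idealised numerical aperture of 1.

```python
# Rough numbers only; both inputs are illustrative assumptions.
WAVELENGTH_NM = 500.0      # green light, middle of the visible range (assumed)
ATOMIC_SPACING_NM = 0.2    # typical neighbour distance in a solid (assumed)


def abbe_limit_nm(wavelength_nm: float, numerical_aperture: float = 1.0) -> float:
    """Smallest resolvable separation under the Abbe criterion, d = lambda / (2 NA)."""
    return wavelength_nm / (2.0 * numerical_aperture)


ratio = WAVELENGTH_NM / ATOMIC_SPACING_NM
limit = abbe_limit_nm(WAVELENGTH_NM)

print(f"wavelength / atomic spacing: about {ratio:.0f}")
print(f"Abbe limit: about {limit:.0f} nm, "
      f"or roughly {limit / ATOMIC_SPACING_NM:.0f} atomic spacings")
```

Whatever reasonable values you plug in, the smallest separation that visible light can resolve spans on the order of a thousand atoms, which is why neighbouring atoms in a solid blur together.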

The picture you're quoting, however, does not image one atom out of many in a material. Instead, it really is a single isolated atom, held in a vacuum by a set of electric "tweezers" called an ion trap. The trap is produced by the metal electrodes that surround the atom (which will be a couple of centimetres across), and the atom is emitting light via fluorescence (i.e. it is being excited by a laser and re-emitting that light). The size of the atom as it appears in the picture has nothing to do with its actual size: as far as the camera is concerned, the atom is a point source, and the nonzero spread in the image is caused by the finite resolution of the camera.
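A quick way to see why the apparent size carries no information about the atom: if, as a simplifying assumption, both the source and the camera's blur are modelled as Gaussians, the widths add in quadrature. The blur width below is a made-up placeholder, not a property of the actual camera.

```python
import math


def image_width(source_width: float, blur_width: float) -> float:
    """Width of a Gaussian source after Gaussian blurring (widths add in quadrature)."""
    return math.hypot(source_width, blur_width)


BLUR_NM = 2000.0  # hypothetical optical blur, referred back to the object plane

for source_nm in (0.1, 1.0, 10.0, 100.0):
    width = image_width(source_nm, BLUR_NM)
    print(f"source {source_nm:6.1f} nm -> image width {width:.3f} nm")
```

Changing the source size by three orders of magnitude barely moves the image width, so the blob in the photograph tells you about the optics and not about the atom.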

Thus, assuming that the trapped atom is bright enough, it could in principle be seen with the naked eye, in which case it would look much like a star on a clear, still night (stars are also point sources as far as our eyes are concerned, though their appearance is then changed by twinkling). Whether the experimental configurations in actual use produce atoms that are bright enough to see with the naked eye is a good question; my understanding is that this isn't quite possible, but that with a completely dark background it isn't that far out of reach.

That does mean that a human wouldn't be able to see both the atom itself and the trap electrodes simultaneously, since you require a completely dark background to begin to have a chance at seeing the atom. As for the camera, the author has clarified in a comment that it's a single thirty-second exposure, with the electrodes illuminated by a camera flash halfway through the exposure.


Finally, to address your expanded question,

If that single atom is being held there by a field, why are the atoms of that very field not visible?

the answer is that the field that is holding it up is not made of atoms at all. The atom in the picture is being held in place by electrostatic forces, which are the same forces that you use to pull up bits of paper with a balloon that you've rubbed against your hair. Electrostatic forces, like magnetic forces and gravity, are said to form a field, but it's a force field that's all force and no atoms. The effect here is analogous to magnetic levitation, except that you use electric fields (carefully engineered ones, produced by the metal electrodes that surround the atom in the picture) instead of magnets.


To be fair, this is actually explained in your link. To put it simply,

If you illuminate it with the right light, it starts shining so bright that a good camera can detect it.

To make it work, the atom has to be as motionless as possible. This is achieved by "freezing it" and using electric fields to hold it still.

Close-up for completeness:

[image: close-up view]


While the physics has already been covered in other answers, let me give you an idea about how to explain the difference between detection and resolution to a 4-year-old:

Try an analogy: something you can't resolve individually but can see pretty easily. The first thing that comes to mind is lights at a distance. A bunch of LEDs at a distance might do it, or your computer/TV screen, one of those big screens you can find on buildings, the lit (or dark) windows of a faraway house, letters on a piece of paper, and probably a lot of things I can't think of now.

The principle stays the same: choose the right lighting conditions and the right distance, and it is easy to see whether a single "pixel" is lit or not. But can you distinguish between one pixel and two? Can you count the pixels if all are lit (a computer screen is probably perfect for this one)? Can you tell where one pixel ends and where another begins?

OK, the analogy does not explain the resolution limits, but I think with a 4-year-old you can get quite a good feeling for the difference between detection and resolution, and for "if I look closer, I see more details - but maybe I cannot look close enough without a lot of effort".
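For grown-ups, the same pixel analogy can be turned into a toy calculation. This is only a sketch under stated assumptions: each "pixel" is a point source blurred into a Gaussian of width `blur`, "detected" means the summed brightness rises above a hypothetical `noise_floor`, and "resolved" means there is a visible dip (at least `min_dip`, also a made-up criterion) between the two sources.

```python
import math


def brightness(x, centres, blur):
    """Summed brightness at x from equal point sources, each blurred to a Gaussian."""
    return sum(math.exp(-0.5 * ((x - c) / blur) ** 2) for c in centres)


def peak_between(separation, blur, samples=201):
    """Brightest value found on a grid spanning the two sources."""
    centres = (-separation / 2, separation / 2)
    xs = [-separation / 2 + separation * i / (samples - 1) for i in range(samples)]
    return max(brightness(x, centres, blur) for x in xs)


def is_detected(separation, blur, noise_floor=0.1):
    """Detection: is there light above the (hypothetical) noise floor at all?"""
    return peak_between(separation, blur) > noise_floor


def is_resolved(separation, blur, min_dip=0.05):
    """Resolution: does the brightness dip noticeably between the two sources?"""
    centres = (-separation / 2, separation / 2)
    midpoint = brightness(0.0, centres, blur)
    return midpoint < (1.0 - min_dip) * peak_between(separation, blur)


for separation in (0.5, 1.0, 2.0, 3.0, 6.0):
    print(f"separation {separation:3.1f} x blur: "
          f"detected={is_detected(separation, 1.0)}, "
          f"resolved={is_resolved(separation, 1.0)}")
```

In this toy model the pair of lights is detected at every separation, but it only counts as two distinct lights once the separation is a few times the blur width.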