How does a kernel mount the root partition?
In ancient times, the kernel was hard-coded with the major/minor device number of the root fs and mounted that device after initializing all device drivers, which were built into the kernel. The rdev utility could be used to modify the root device number in the kernel image without having to recompile it.
Eventually boot loaders came along and could pass a command line to the kernel. If the root= argument was passed, it told the kernel where the root fs was, overriding the built-in value. The drivers needed to access that device still had to be built into the kernel. While the argument looks like a normal device node in the /dev directory, there obviously is no /dev directory before the root fs is mounted, so the kernel cannot look up a device node there. Instead, certain well-known device names are hard-coded into the kernel so the string can be translated to a device number. Because of this, the kernel can recognize things like /dev/sda1, but not more exotic things like /dev/mapper/vg0-root or a volume UUID.
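To illustrate the name-to-number translation described above, here is a minimal sketch in Python. It is not the kernel's actual code (which is written in C); the function name name_to_dev and the constants are made up for this example, and it only covers classic "sd" disks, assuming the conventional numbering of major 8 with 16 minors per disk.

```python
import re

# Assumptions for this sketch: "sd" disks use major number 8 and each
# disk owns 16 consecutive minor numbers (minor 0 is the whole disk).
SD_MAJOR = 8
SD_MINORS_PER_DISK = 16


def name_to_dev(path):
    """Translate a name like /dev/sda1 to (major, minor), or return None.

    Hypothetical helper: it needs no /dev directory, only a fixed naming rule.
    """
    match = re.fullmatch(r"/dev/sd([a-p])([0-9]*)", path)
    if match is None:
        # Anything else (e.g. /dev/mapper/vg0-root) is unknown to this table.
        return None
    disk_index = ord(match.group(1)) - ord("a")
    partition = int(match.group(2)) if match.group(2) else 0
    if partition >= SD_MINORS_PER_DISK:
        return None
    return SD_MAJOR, disk_index * SD_MINORS_PER_DISK + partition


if __name__ == "__main__":
    for name in ("/dev/sda1", "/dev/sdb2", "/dev/mapper/vg0-root"):
        print(name, "->", name_to_dev(name))
```

A name outside the fixed rule simply cannot be resolved, which is why something like /dev/mapper/vg0-root needs help from userspace.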
Later, the initrd came into the picture. Along with the kernel, the boot loader would load the initrd image, which was some kind of compressed filesystem image (a gzipped ext2 image or a gzipped romfs image; squashfs finally became dominant). The kernel would decompress this image into a ramdisk and mount the ramdisk as the root fs. This image contained some additional drivers, and boot scripts instead of a real init. These boot scripts performed various tasks to recognize hardware, activate things like RAID arrays and LVM, detect UUIDs, and parse the kernel command line to find the real root, which could now be specified by UUID, volume label and other advanced things. They then mounted the real root fs in /initrd, executed the pivot_root system call to have the kernel swap / and /initrd, and exec'd /sbin/init on the real root, which would then unmount /initrd and free the ramdisk.
Finally, today we have the initramfs. This is similar to the initrd, but instead of being a compressed filesystem image that is loaded into a ramdisk, it is a compressed cpio archive. A tmpfs is mounted as the root, and the archive is extracted there. Instead of using pivot_root, which was regarded as a dirty hack, the initramfs boot scripts mount the real root in /root, delete all files in the tmpfs root, then chroot into /root and exec /sbin/init.
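As a small illustration of "a compressed cpio archive", the sketch below checks whether a blob of bytes looks like a gzip-compressed cpio archive by inspecting magic numbers. The function name describe_image is made up for this example, and the sketch assumes gzip compression and the "newc" cpio format (magic 070701). Real images may use other compressors or consist of several concatenated archives, which this does not handle.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"      # first two bytes of any gzip stream
CPIO_NEWC_MAGIC = b"070701"   # ASCII magic of a "newc" format cpio header


def describe_image(data):
    """Guess what kind of boot image `data` is (hypothetical helper)."""
    if data.startswith(GZIP_MAGIC):
        # Decompress and look at the start of the payload.
        inner = gzip.decompress(data)
        if inner.startswith(CPIO_NEWC_MAGIC):
            return "gzip-compressed cpio archive"
        return "gzip-compressed data, not a newc cpio archive"
    if data.startswith(CPIO_NEWC_MAGIC):
        return "uncompressed cpio archive"
    return "unknown"


if __name__ == "__main__":
    # Fake payload: only the magic matters for this check.
    fake_archive = CPIO_NEWC_MAGIC + b"0" * 104
    print(describe_image(gzip.compress(fake_archive)))
```

An old-style initrd, by contrast, would decompress to a filesystem image such as ext2 rather than to a cpio stream.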
Linux initially boots with a ramdisk (called an initrd, for "INITial RamDisk") as /. This disk has just enough on it to be able to find the real root partition (including any driver and filesystem modules required). It mounts the root partition onto a temporary mount point on the initrd, then invokes pivot_root(8) to swap the root and temporary mount points, leaving the initrd in a position to be umounted and the actual root filesystem on /.
Sounds like you're asking how the kernel "knows" which partition is the root partition without access to configuration files in /etc.
The kernel can accept command line arguments like any other program. GRUB, like most other bootloaders, can accept command line arguments as user input, or store them and make various combinations of command line arguments available via a menu. The bootloader passes the command line arguments to the kernel when it loads it (I don't know the name or mechanics of this convention, but it's probably similar to how an application receives command line arguments from a calling process in a running kernel).
One of those command line options is root, where you can specify the root filesystem, e.g. root=/dev/sda1.
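To make the root= option concrete, here is a rough Python sketch that splits a kernel command line into parameters and picks out root. The function name parse_cmdline and the sample command line are invented for this example. The kernel's own parser is written in C and also handles quoted values containing spaces, which this sketch ignores.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of parameters.

    Hypothetical helper: "key=value" tokens map key to value, and bare
    flags such as "ro" or "quiet" map to None.
    """
    params = {}
    for token in cmdline.split():
        # Split on the first "=" only, so root=UUID=... keeps its value intact.
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


if __name__ == "__main__":
    # Made-up example command line, as a bootloader might pass it.
    example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet"
    print(parse_cmdline(example)["root"])
```

On a running Linux system, the command line the kernel was booted with can be read from /proc/cmdline and fed to a function like this one.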
If the kernel uses an initrd, the bootloader is responsible for telling the kernel where it is, or putting the initrd in a standard memory location (I think) - that's at least the way it works on my Guruplug.
It's entirely possible to not specify a root at all and then have your kernel panic immediately after starting, complaining that it can't find a root filesystem.
There might be other ways of passing this option to the kernel.