The difference between initrd and initramfs?
I think you are right on all points.
The difference is easy to see if you follow the steps needed when booting:
initrd
- A ramdev block device is created. It is a RAM-based block device, that is, a simulated hard disk that uses memory instead of physical disks.
- The initrd file is read and unzipped into the device, as if you did zcat initrd | dd of=/dev/ram0 or something similar.
- The initrd contains an image of a filesystem, so now you can mount the filesystem as usual: mount /dev/ram0 /root. Naturally, filesystems need a driver, so if you use ext2, the ext2 driver has to be compiled in-kernel.
- Done!
initramfs
- A tmpfs is mounted: mount -t tmpfs nodev /root. The tmpfs doesn't need a driver, it is always in-kernel. No device needed, no additional drivers.
- The initramfs is uncompressed directly into this new filesystem: zcat initramfs | cpio -i, or similar.
- Done!
And yes, it is still called initrd in many places although it is an initramfs, particularly in boot loaders, as for them it is just a BLOB. The difference is made by the OS when it boots.
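To make the "just a BLOB" point concrete, here is a minimal Python sketch of how one could tell the two kinds of image apart by looking at their contents. It is only an illustration, not what the kernel does internally: the function name classify_image is made up, it only understands gzip compression, the cpio "newc" magic numbers and the ext2/3/4 superblock magic, and it ignores images made of several concatenated archives.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"
CPIO_NEWC_MAGICS = (b"070701", b"070702")  # cpio "newc" header, without and with CRC
EXT_SUPERBLOCK_OFFSET = 1024  # the ext2/3/4 superblock starts 1024 bytes into the image
EXT_MAGIC_OFFSET = 56         # offset of the s_magic field inside the superblock
EXT_MAGIC = 0xEF53


def classify_image(blob: bytes) -> str:
    """Guess whether a boot image is an initramfs (cpio) or an initrd (filesystem image)."""
    if blob[:2] == GZIP_MAGIC:
        blob = gzip.decompress(blob)
    if blob[:6] in CPIO_NEWC_MAGICS:
        return "initramfs (cpio archive)"
    magic_at = EXT_SUPERBLOCK_OFFSET + EXT_MAGIC_OFFSET
    if len(blob) >= magic_at + 2:
        (magic,) = struct.unpack_from("<H", blob, magic_at)
        if magic == EXT_MAGIC:
            return "initrd (ext2/3/4 filesystem image)"
    return "unknown"


# Hand-made stand-ins for real images, just enough to carry the magic numbers.
fake_initramfs = gzip.compress(b"070701" + b"0" * 104)
fake_fs_image = bytearray(2048)
struct.pack_into("<H", fake_fs_image, 1080, EXT_MAGIC)
fake_initrd = gzip.compress(bytes(fake_fs_image))

print(classify_image(fake_initramfs))
print(classify_image(fake_initrd))
```

On a real system you would pass in the bytes of the image file your boot loader loads. Its path varies between distributions, so it is left out here.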
Dentry (and inode) cache
The filesystem subsystem in Linux has three layers. At the top is the VFS (virtual filesystem), which implements the system call interface and handles crossing mountpoints as well as the default permission and limit checks. Below it are the drivers for individual filesystems, and those in turn interface to drivers for block devices (disks, memory cards, etc.; network filesystems, which use network interfaces instead, are the exception).
The interface between the VFS and a filesystem consists of several classes (it's plain C, so structures containing pointers to functions and such, but conceptually it's an object-oriented interface). The main three classes are inode, which describes any object (file or directory) in a filesystem; dentry, which describes an entry in a directory; and file, which describes a file opened by a process. When mounted, the filesystem driver creates an inode and a dentry for its root, and the other ones are created on demand when a process wants to access a file, and are eventually expired. That's the dentry and inode cache.
Yes, it does mean that for every open file and every directory down to the root there have to be inode and dentry structures allocated in kernel memory representing it.
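The on-demand creation can be pictured with a toy model in Python. This is not kernel code and none of the names below (Inode, Dentry, ToyFilesystem, walk) come from the kernel API. It only mimics the idea that a path walk asks the filesystem driver for an entry once and afterwards finds it in the cache.

```python
class Inode:
    def __init__(self, number, is_dir):
        self.number = number
        self.is_dir = is_dir


class Dentry:
    def __init__(self, name, inode, parent=None):
        self.name = name
        self.inode = inode
        self.parent = parent
        self.children = {}  # cached child dentries, filled in on demand


class ToyFilesystem:
    """Hypothetical driver: maps (parent inode number, name) to an inode."""

    def __init__(self, table):
        self.table = table
        self.driver_lookups = 0  # how often the "driver" had to be asked

    def lookup(self, parent_inode, name):
        self.driver_lookups += 1
        return self.table[(parent_inode.number, name)]


def walk(root, fs, path):
    """Resolve a path, creating dentries only for components not yet cached."""
    dentry = root
    for name in (part for part in path.split("/") if part):
        child = dentry.children.get(name)
        if child is None:
            child = Dentry(name, fs.lookup(dentry.inode, name), parent=dentry)
            dentry.children[name] = child
        dentry = child
    return dentry


# "Mounting" creates the inode and dentry for the root; the rest comes later.
fs = ToyFilesystem({(2, "etc"): Inode(10, True), (10, "passwd"): Inode(11, False)})
root = Dentry("/", Inode(2, True))

first = walk(root, fs, "/etc/passwd")
second = walk(root, fs, "/etc/passwd")
print(fs.driver_lookups, first is second)
```

The second walk is answered entirely from the cached dentries. What the model leaves out is expiry: the real cache drops unused entries under memory pressure.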
Page cache
In Linux, each memory page that contains userland data is represented by a unified page structure. This may mark the page as either anonymous (it may be swapped out to swap space if available) or associate it with an inode on some filesystem (it may be written back to and re-read from the filesystem), and the page can be part of any number of memory maps, i.e. visible in the address space of some process. The sum of all pages currently loaded in memory is the page cache.
The pages are used to implement the mmap interface, and while the regular read and write system calls can be implemented by the filesystem by other means, the majority of filesystems use generic functions that also use pages. There are generic functions that, when a file read is requested, allocate pages and call the filesystem to fill them in, one by one. For a block-device-based filesystem, it just calculates the appropriate addresses and delegates this filling to the block device driver.
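One visible consequence of write and mmap going through the same pages can be checked from userland. The following Python sketch assumes Linux (or another system with a unified page cache) and a temporary directory on an ordinary local filesystem; the function name is made up for the example.

```python
import mmap
import os
import tempfile


def write_seen_through_mmap():
    """Write with pwrite() and read the result back through an existing mapping."""
    with tempfile.TemporaryFile() as f:
        f.write(b"old data")
        f.flush()
        # Map the whole file read-only; the mapping is shared, not a private copy.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            before = view[:8]
            os.pwrite(f.fileno(), b"new", 0)  # ordinary write system call
            after = view[:8]
    return before, after


before, after = write_seen_through_mmap()
print(before, after)
```

The mapping is never refreshed by hand. It shows the new bytes because the write landed in the very page that is mapped.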
ramdev (ramdisk)
Ramdev is a regular block device. This allows layering any filesystem on top of it, but it is restricted by the block device interface, which only has methods to fill in a page allocated by the caller and to write it back. That's exactly what is needed for real block devices like disks, memory cards, USB mass storage and such, but for a ramdisk it means that the data exists in memory twice: once in the memory of the ramdev and once in the memory allocated by the caller.
This is the old way of implementing initrd, from the times when an initrd was a rare and exotic occurrence.
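The double copy can be illustrated with a toy memory-accounting model in Python. It is purely a sketch: the class names are invented, nothing here talks to a real block device, and the second class stands for a store whose cached pages are the only copy of the data, which is the approach tmpfs takes (described in the next section).

```python
PAGE_SIZE = 4096


class ToyRamdisk:
    """Toy block device: owns its data and copies pages out on read."""

    def __init__(self, image: bytes):
        self.backing = bytearray(image)  # memory held by the ramdisk itself
        self.page_cache = {}             # page index -> copy held by the caller

    def read_page(self, index: int) -> bytes:
        if index not in self.page_cache:
            start = index * PAGE_SIZE
            # The block interface fills a page allocated by the caller,
            # so from now on this data exists a second time.
            self.page_cache[index] = bytes(self.backing[start:start + PAGE_SIZE])
        return self.page_cache[index]

    def memory_used(self) -> int:
        return len(self.backing) + sum(len(p) for p in self.page_cache.values())


class ToyCacheOnlyStore:
    """Toy store whose cached pages are the only copy of the data."""

    def __init__(self, image: bytes):
        self.page_cache = {
            offset // PAGE_SIZE: image[offset:offset + PAGE_SIZE]
            for offset in range(0, len(image), PAGE_SIZE)
        }

    def read_page(self, index: int) -> bytes:
        return self.page_cache[index]

    def memory_used(self) -> int:
        return sum(len(p) for p in self.page_cache.values())


def read_everything(store, pages: int) -> int:
    """Touch every page once and report how much memory the store accounts for."""
    for index in range(pages):
        store.read_page(index)
    return store.memory_used()


image = bytes(4 * PAGE_SIZE)
print(read_everything(ToyRamdisk(image), 4), read_everything(ToyCacheOnlyStore(image), 4))
```

After every page has been read once, the ramdisk model accounts for twice the image size, while the cache-only model stays at the image size.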
tmpfs
Tmpfs is different. It's a dummy filesystem. The methods it provides to the VFS are the absolute bare minimum to make it work (as such, it's excellent documentation of what the inode, dentry and file methods should do). Files only exist if there is a corresponding inode and dentry in the inode cache, created when the file is created and never expired unless the file is deleted. The pages are associated with files when data is written and otherwise behave as anonymous ones (data may be stored to swap; page structures remain in use as long as the file exists).
This means there are no extra copies of the data in memory, and the whole thing is a lot simpler and, due to that, slightly faster too. It simply uses the data structures that serve as a cache for any other filesystem as its primary storage.
This is the new way of implementing initrd (initramfs, but the image is still called just initrd).
It is also the way of implementing "POSIX shared memory" (which simply means tmpfs is mounted on /dev/shm and applications are free to create files there and mmap them; simple and efficient), and recently even /tmp and /run (or /var/run) often have tmpfs mounted, especially on notebooks, to keep disks from having to spin up or to avoid some wear in the case of SSDs.
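The "create a file there and mmap it" recipe looks roughly like this in Python. It is a sketch under assumptions: it expects a writable tmpfs at /dev/shm, falls back to the default temporary directory if there is none, and uses two mappings in one process as a stand-in for two cooperating processes. The function name share_bytes is made up.

```python
import mmap
import os
import tempfile

SHM_DIR = "/dev/shm"  # assumption: tmpfs is mounted here, as on most Linux systems


def share_bytes(payload: bytes) -> bytes:
    """Write payload through one mapping of a tmpfs file and read it through another."""
    usable = os.path.isdir(SHM_DIR) and os.access(SHM_DIR, os.W_OK)
    with tempfile.TemporaryFile(dir=SHM_DIR if usable else None) as f:
        f.truncate(len(payload))  # give the file its size before mapping it
        with mmap.mmap(f.fileno(), len(payload)) as writer, \
                mmap.mmap(f.fileno(), len(payload)) as reader:
            writer[:] = payload
            return reader[:]


print(share_bytes(b"hello from tmpfs"))
```

Both mappings refer to the same pages, so nothing is copied between them. The payload has to be non-empty, because an empty file cannot be mapped.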