Device vs Partition vs File System vs Volume: how do these concepts relate to each other, accurately
Let's start with some basics:
Data: Data is just a sequence of bits. How the contents are interpreted depends on the application you use to read the data. Example 1: You open it with a text editor, then this application may group the bits 8 at a time and interpret each group as an ASCII character. Example 2: You open the same file with an audio player, then it will, for example, put 12 bits together to get one amplitude of the played audio.
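To make the "same bits, different meaning" point concrete, here is a minimal Python sketch. The four byte values are made up for illustration, and I use 16-bit samples instead of 12-bit ones only because the standard `struct` module handles them directly.

```python
import struct

# Four arbitrary bytes (hypothetical example data).
raw = bytes([72, 105, 33, 0])

# Interpretation 1: a text editor groups the bits 8 at a time and
# reads each byte as an ASCII character (the trailing NUL is dropped here).
as_text = raw[:3].decode("ascii")

# Interpretation 2: an audio player could read the very same bytes
# as two little-endian signed 16-bit amplitude samples.
as_samples = struct.unpack("<2h", raw)

print(as_text)     # Hi!
print(as_samples)  # (26952, 33)
```

Nothing in the bytes themselves says which reading is the right one; that decision lives in the application.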
Storage Device: A device is a physical piece of storage where you can keep data. These are often accessible in a 'Random Access' fashion, e.g. get bit number 1337 -> 1 (simplified; real devices hand out bytes or whole sectors, not single bits). Examples of these devices are: hard disk drives, solid state disks, USB sticks, CDs, DVDs, but also the main memory (RAM) of your computer.
These two things are all you need: 1. A device to store/read the data 2. Rules on how to handle the data
Example: Let's say you copy a binary to the beginning of your HDD and tell your computer to boot from this HDD. The computer will read the first instruction and execute it, then read the next instruction, and so on. This is how a bootloader gets started. At this early stage there are no filesystems, partitions, etc. involved.
In the early days of software development you didn't 'open a File', you 'read bytes 100 to 180' and worked with this data (maybe the 80 bytes are a string or audio data). Working with numbers got annoying (Where does my string start? Was it 40? How long was it again? Which string is this?), so filesystems were invented:
Filesystem: A filesystem is just a layer that gives some meaning to the bytes. A file in the filesystem is just the information where the data starts, how long it is, and a simpler way to address it ('diary.txt' is easier to handle than '4000 Bytes beginning at Byte 500'). Paths and the tree view are just there to make it more convenient to find and organize files.
So basically the Filesystem uses data and interprets it as a filesystem. Furthermore it allows the user (or other applications) to access chunks of this data in an easy way. The filesystem does not care where the data is stored, it may come from any device. You can also create a
Example: Filesystem gets data ([---Data---]), handles it, and allows access to chunks ([D]) of the data.
[---Data---] -> Filesystem -> [D][D][D][D]
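The idea can be sketched in a few lines of Python. This is a toy, not how any real filesystem is laid out: the names `data`, `table` and `read_file`, the file names, and the offsets are all made up for illustration.

```python
# A flat block of bytes, as a device would present it: [---Data---]
data = bytearray(64)
data[8:13] = b"hello"
data[20:32] = b"dear diary.."

# The "filesystem": a table mapping each name to (start offset, length).
table = {
    "greeting.txt": (8, 5),
    "diary.txt": (20, 12),
}

def read_file(name: str) -> bytes:
    """Look up where the file lives and return that chunk [D] of the data."""
    start, length = table[name]
    return bytes(data[start:start + length])

print(read_file("diary.txt"))  # b'dear diary..'
```

The caller asks for 'diary.txt' and never has to remember "12 bytes beginning at byte 20". A real filesystem stores this table inside the data itself, which is why formatting a device writes to it.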
Since a file is just data received from a filesystem, you can install a filesystem in a file. No problem:
HDD ---> Filesystem ---> File ---> Filesystem ---> File
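As a rough analogy for this chain, here is a Python sketch that nests one zip archive inside another, entirely in memory. A zip archive is not a real filesystem, so take this only as an illustration of "a container stored as a file inside another container"; the file names are hypothetical.

```python
import io
import zipfile

# Inner container: an archive holding one file.
inner_buf = io.BytesIO()
with zipfile.ZipFile(inner_buf, "w") as inner:
    inner.writestr("diary.txt", "nested hello")

# Outer container: an archive whose only file is the inner archive.
outer_buf = io.BytesIO()
with zipfile.ZipFile(outer_buf, "w") as outer:
    outer.writestr("inner.zip", inner_buf.getvalue())

# Walk the chain: outer container -> file -> inner container -> file.
with zipfile.ZipFile(io.BytesIO(outer_buf.getvalue())) as outer:
    inner_bytes = outer.read("inner.zip")
with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
    text = inner.read("diary.txt").decode("utf-8")

print(text)  # nested hello
```

Each layer only sees a run of bytes and does not care whether they come from a disk, from memory, or from a file in another container.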
These are the main concepts in my opinion. You talked about some other things like partitions, (logical) volumes, volume groups, (encryption) containers, etc. Don't get confused by these things, they are just other layers used to organize data. On a closer look you will see that they work basically like filesystems. Let's take partitions: A partition contains the information where in the underlying data it starts, how long it is, and a way to address it (e.g. partition number 2). Sounds familiar?
So, what is the Java developer's view on this? Most of the time you will be accessing data through File, although it may be totally reasonable to read from or write to the HDD directly. I think the best approach is: use the data source which fits your application best:
Example:
- organized data? -> Database
- text? -> File
- partition organizing tool -> Read directly from the device, e.g. /dev/sd0
Hope that helps to clarify some things.
Overall picture (Windows-like) [figure; icons source: vector.me]
Disk, drive, partition, volume
Disk or drive: The physical device used to store data. 'Drive' is the more generic term, while 'disk' refers to a particular storage technology, e.g. there are hard disk drives, floppy disk drives and USB flash drives.
Disks are divided into sectors, and each sector contains the same number of bytes. Every sector has a sector number which can be used to reference it individually.
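Sector addressing boils down to one multiplication, as this Python sketch shows. The 512-byte sector size is an assumption (many current drives use 4096), and the in-memory `fake_disk` stands in for a real device; the function names are made up for illustration.

```python
import io

SECTOR_SIZE = 512  # assumed for this sketch; many drives use 4096

def sector_to_offset(sector_number: int, sector_size: int = SECTOR_SIZE) -> int:
    """Byte offset at which the given sector starts."""
    return sector_number * sector_size

def read_sector(device, sector_number: int, sector_size: int = SECTOR_SIZE) -> bytes:
    """Read one whole sector from a seekable binary file object."""
    device.seek(sector_to_offset(sector_number, sector_size))
    return device.read(sector_size)

# A fake 4-sector "disk" held in memory; sector n is filled with byte value n.
fake_disk = io.BytesIO(b"".join(bytes([n]) * SECTOR_SIZE for n in range(4)))
sector_2 = read_sector(fake_disk, 2)

print(sector_to_offset(2))  # 1024
print(sector_2[:4])         # b'\x02\x02\x02\x02'
```

The same `read_sector` would work on a raw device opened in binary mode, given sufficient privileges and the device's actual sector size.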
Partition and volume: Often used interchangeably, but they are not the same thing: there can be multiple volumes within a single partition.
A partition is a chunk of a disk with a specific size (e.g. a specific sector range of a hard disk). Disk partitioning is the act of dividing a disk into multiple chunks, as if there were multiple disks. Some partitions may in turn be divided into multiple separate logical chunks, provided the partitioning scheme in use supports it.
The effective chunk (regardless of whether it is physical or logical) is called a volume. The raw volume can later be formatted to contain a file system, which can itself store actual data.
The operating system needs to keep track of the volumes in the system. They are where files and directories are stored.
Partitioning can be done using two main partitioning schemes:
- Master Boot Record (MBR)
- GUID Partition Table (GPT).
MBR
MBR is the scheme used with the legacy BIOS firmware. An MBR partition table has room for up to 4 partitions on a drive, each either primary or extended. Because sector addresses are 32 bits wide, the visible space on a drive with 512-byte sectors is limited to 2 TB; space in excess cannot be used by partitions.
There can be only one extended partition per drive; this partition can be divided into up to 128 logical volumes.
One primary partition can be selected as the active partition and be used to boot the computer.
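To show what the four-entry table looks like in practice, here is a Python sketch that parses a classic MBR sector. It assumes the usual layout (partition table at byte 446, four 16-byte entries, 0x55 0xAA signature at byte 510). The function name and the dictionary keys are mine, and the sample sector is synthetic, not read from a real disk.

```python
import struct

def parse_mbr(sector: bytes) -> list:
    """Return the non-empty partition entries of a 512-byte MBR sector."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):  # the table has exactly four slots
        entry = sector[446 + 16 * i : 446 + 16 * (i + 1)]
        # status, CHS start, type, CHS end, first sector (LBA), sector count
        status, _chs_first, ptype, _chs_last, lba_start, num_sectors = struct.unpack(
            "<B3sB3sII", entry
        )
        if ptype == 0:
            continue  # type 0 marks an unused slot
        partitions.append({
            "slot": i,
            "active": status == 0x80,  # 0x80 flags the active (bootable) partition
            "type": ptype,
            "first_sector": lba_start,
            "sectors": num_sectors,
        })
    return partitions

# Synthetic example: one active partition of type 0x07 starting at sector 2048.
sample = bytearray(512)
sample[446:462] = struct.pack("<B3sB3sII", 0x80, b"\0\0\0", 0x07, b"\0\0\0", 2048, 204800)
sample[510:512] = b"\x55\xaa"

parts = parse_mbr(bytes(sample))
print(parts)
```

The start and length fields are 32-bit sector counts, which is where the size limit of the scheme comes from.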
GPT
GPT supports drives larger than 2 TB and up to 128 partitions per drive. Booting from a GPT drive generally requires EFI (UEFI) firmware rather than legacy BIOS; Windows in particular can only boot from GPT on an EFI system.
GPT keeps a protective MBR at the beginning of its space. This MBR shows the drive as being a single MBR partition, to cope with tools which do not recognize GPT.
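A tool can tell the two schemes apart with a quick check, sketched below in Python. It relies on two facts of the GPT layout: the MBR entry covering the disk has partition type 0xEE, and the GPT header in the following sector starts with the ASCII signature "EFI PART". The function name and the synthetic test data are made up for illustration.

```python
def looks_like_gpt(first_sectors: bytes, sector_size: int = 512) -> bool:
    """Heuristic check on the first two sectors of a drive."""
    if len(first_sectors) < 2 * sector_size:
        return False
    mbr = first_sectors[:512]
    header = first_sectors[sector_size : 2 * sector_size]
    # Byte 4 of each 16-byte MBR entry holds the partition type.
    has_protective_entry = any(mbr[446 + 16 * i + 4] == 0xEE for i in range(4))
    return (
        mbr[510:512] == b"\x55\xaa"
        and has_protective_entry
        and header[:8] == b"EFI PART"
    )

# Synthetic GPT-like start of a disk (not taken from a real drive).
gpt_disk = bytearray(1024)
gpt_disk[446 + 4] = 0xEE
gpt_disk[510:512] = b"\x55\xaa"
gpt_disk[512:520] = b"EFI PART"

# Synthetic plain-MBR disk: valid signature, one type-0x07 entry, no GPT header.
mbr_disk = bytearray(1024)
mbr_disk[446 + 4] = 0x07
mbr_disk[510:512] = b"\x55\xaa"

print(looks_like_gpt(bytes(gpt_disk)))  # True
print(looks_like_gpt(bytes(mbr_disk)))  # False
```

A tool that only understands MBR sees one partition spanning the drive and leaves it alone, which is the whole purpose of that first sector.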
Image
An image is a snapshot of a volume (files and other data) packed into a single file, similar to a zip file. An image of a volume can be restored onto another volume, and an image can also be “mounted” or “attached” so that it appears like any other volume, or appears as a directory of an existing volume.
Additional volumes can be created (“mounted”) from image files without being linked to actual physical units (except the one where the image file is stored).
File system
The file system is used to control how data is stored and retrieved on a volume. It's the practical way to store data organized into files and directories instead of unordered and unrelated bytes.
The file system takes care of the file content and structure (tree). Directories and files are given properties (like read only) and access permissions.
The legacy FAT file system was used with DOS. It's still supported by modern devices for compatibility and exchange purposes. The FAT versions FAT12, FAT16 and FAT32 are named after the number of bits used in the allocation table entries, which determines how much space can be referenced. FAT32 can reference 2^32 = 4,294,967,296 sectors. With sectors of 512 bytes, FAT32 can therefore manage 2 TB.
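The 2 TB figure is plain arithmetic, shown here in Python. The 512-byte sector size is the assumption from the text; with larger sectors the limit grows accordingly.

```python
SECTOR_SIZE = 512              # bytes per sector, as assumed in the text
addressable_sectors = 2 ** 32  # what a 32-bit sector number can reach

max_bytes = addressable_sectors * SECTOR_SIZE
max_tib = max_bytes / 2 ** 40  # 2**40 bytes = 1 TiB

print(addressable_sectors)  # 4294967296
print(max_bytes)            # 2199023255552
print(max_tib)              # 2.0
```

The same calculation explains the 2 TB limit of MBR, which also stores sector numbers in 32 bits.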
Modern Windows versions use NTFS. NTFS adds support for metadata, access control lists (permissions) and journaling.
macOS uses APFS.
Linux often defaults to ext4.
Android uses ext4.
Optical disks (CD, DVD, Blu-ray) often use UDF.
Disk, partitions, volumes and file systems on Windows (MBR):
Fragmentation
When the file system cannot allocate contiguous sectors for a file, the file content is stored in sectors that are far apart. This fragmentation slows down data access on mechanical devices.
HFS+ and ext4 have fragmentation control mechanisms, but to limit fragmentation, most file systems allocate space for a file in complete blocks/clusters, a block containing a given number of contiguous sectors. For example, NTFS can be configured to allocate 4 KB clusters. Some file systems are able to reduce the amount of unused space, but a file usually owns more space than is actually required to store its data.
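The cost of cluster allocation is easy to work out, as in this Python sketch. It assumes the 4 KB cluster size from the example above and ignores file-system-specific optimizations; the function names are made up for illustration.

```python
def allocated_size(file_size: int, cluster_size: int = 4096) -> int:
    """Space a file occupies when allocated in whole clusters."""
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size

def slack(file_size: int, cluster_size: int = 4096) -> int:
    """Allocated but unused bytes at the end of the last cluster."""
    return allocated_size(file_size, cluster_size) - file_size

for size in (1, 4096, 4097, 5000):
    print(size, allocated_size(size), slack(size))
# 1 4096 4095
# 4096 4096 0
# 4097 8192 4095
# 5000 8192 3192
```

A 1-byte file and a 4096-byte file take the same room on disk under this model, which is why many tiny files can use far more space than their total size suggests.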
Boot support
When the BIOS/EFI firmware starts the computer, the file system, which is part of the OS, is not yet available. BIOS/EFI instead looks for a boot sector (the master boot record on a PC) written on the boot drive during partitioning and/or OS installation. The code in it is a bootstrap which is able to load and execute the appropriate code from the active partition to start the main OS components, including the file system, which provides the functions to load files. Then the OS takes control of the computer.
Additional sources:
- https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2003/cc787202(v=ws.10)
- https://en.wikiversity.org/wiki/IT_Fundamentals/2014/File_Systems
- https://www.howtogeek.com/school/using-windows-admin-tools-like-a-pro/lesson4/?PageSpeed=noscript
- https://www.lifewire.com/volume-vs-partition-2260237
- https://en.wikipedia.org/wiki/File_system_fragmentation