Why was the case-insensitive option in ext4 needed?
The claim that a "case-insensitive file system allows us to resolve important bottlenecks for applications being ported from other operating systems" does not convince me, and I cannot understand how the process of normalization and casefolding would allow us to optimize our disk storage.
Wine, Samba, and Android have to provide case-insensitive filesystem semantics. If the underlying filesystem is case-sensitive, then every time a case-sensitive lookup fails, Wine et al. have to scan each directory to prove there are no case-insensitive matches (e.g. if looking up /foo/bar/readme.txt fails, they have to perform a full directory listing and case-folded comparison of all files in foo/bar/*, all directories in foo/*, and everything in /*).
There are a few problems with this:
- It can get very slow with deeply nested paths (which can generate hundreds of FS calls) or directories with tens of thousands of files (e.g. storing incremental backups over SMB).
- These checks introduce race conditions.
- It's fundamentally unsound: if both readme.txt and README.txt exist but an application asks for README.TXT, which file is returned is undefined.
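The fallback scan described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not how Wine, Samba, or SDCardFS actually implement it; the function names `find_matches` and `resolve_ignoring_case` are made up for this example, and it assumes it runs on a case-sensitive filesystem.

```python
import os
import tempfile


def find_matches(directory, name):
    """Return every entry in `directory` whose case-folded name equals `name`'s."""
    wanted = name.casefold()
    return sorted(e for e in os.listdir(directory) if e.casefold() == wanted)


def resolve_ignoring_case(root, relpath):
    """Resolve `relpath` under `root`, ignoring case.

    Returns (resolved_path_or_None, number_of_full_directory_scans).
    """
    current, scans = root, 0
    for part in relpath.strip("/").split("/"):
        exact = os.path.join(current, part)
        if os.path.lexists(exact):
            current = exact      # fast path: the exact-case name exists
            continue
        if not os.path.isdir(current):
            return None, scans   # cannot descend into a non-directory
        scans += 1               # slow path: list and compare every entry
        matches = find_matches(current, part)
        if not matches:
            return None, scans
        # If several names match, nothing says which one is "right";
        # taking the first sorted name is an arbitrary choice.
        current = os.path.join(current, matches[0])
    return current, scans


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "foo", "bar"))
        for name in ("readme.txt", "README.txt"):
            open(os.path.join(root, "foo", "bar", name), "w").close()
        print(resolve_ignoring_case(root, "FOO/BAR/README.TXT"))
        print(find_matches(os.path.join(root, "foo", "bar"), "README.TXT"))
```

Each path component that misses costs a full `os.listdir` of its parent, another process can create or rename an entry between the existence check and the scan, and when two names fold to the same key the function has to pick one arbitrarily. Those are the three problems in the list above.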
Android went so far as to emulate case-insensitivity using FUSE/wrapfs and then the in-kernel SDCardFS. However, SDCardFS only made things faster by moving the process into kernel space†. It still had to walk the filesystem (and was thus I/O bound), introduced race conditions, and was fundamentally unsound. That is why Google funded† development of native per-directory case-insensitivity in F2FS and has since deprecated SDCardFS.
There have been multiple attempts in the past to enable case-insensitive lookups via VFS. The most recent attempt in 2018 allowed mounting a case-insensitive view of the filesystem. Ted Ts'o specifically cited the issues with wrapfs as a reason for adding this functionality, as it would at least be faster and (I believe) free of race conditions. However, it was still unsound (requesting README.TXT could return readme.txt or README.txt). This was rejected in favor of just adding per-directory support for case-insensitivity and is unlikely to ever make it into VFS††.
Furthermore, users expect case-insensitivity, so any consumer-oriented operating system has to provide it. Unix couldn't support it natively because Unicode didn't exist and strings were just bags of bytes. There are plenty of valid criticisms of how case-folding was handled in the past, but Unicode provides an immutable case-fold function that works for all but a single locale (Turkic, and even then it's just two codepoints). And the filesystem b-tree is the only reasonable place to implement this behavior.
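To make the normalization and case-folding step concrete, here is a minimal Python sketch. It uses Python's own `unicodedata.normalize` and `str.casefold`, which only approximate the idea; the kernel uses its own UTF-8 tables, and I am not claiming the results are identical in every corner case. `fold_key` and `FoldedDirectory` are hypothetical names, and the dict is merely a stand-in for the directory index.

```python
import unicodedata


def fold_key(name):
    """Comparison key: canonical decomposition (NFD) plus Unicode case folding."""
    folded = unicodedata.normalize("NFD", name).casefold()
    return unicodedata.normalize("NFD", folded)


class FoldedDirectory:
    """Toy directory whose index is keyed by fold_key (a dict, not a real b-tree)."""

    def __init__(self):
        self._entries = {}  # fold_key(name) -> name exactly as it was created

    def create(self, name):
        key = fold_key(name)
        if key in self._entries:
            # Two names that differ only by case or normalization cannot coexist.
            raise FileExistsError(self._entries[key])
        self._entries[key] = name

    def lookup(self, name):
        # One index lookup, no directory scan, and at most one possible answer.
        return self._entries.get(fold_key(name))


if __name__ == "__main__":
    d = FoldedDirectory()
    d.create("readme.txt")
    print(d.lookup("README.TXT"))
    # Precomposed and decomposed spellings of the same name get the same key.
    print(fold_key("caf\u00e9") == fold_key("cafe\u0301"))
    # The Turkic exception: default folding maps "I" to "i", not to dotless "\u0131".
    print(fold_key("I") == fold_key("\u0131"))
```

Because the key is computed once when the entry is created and the index is keyed by it, a lookup never has to list the directory, and two names that collide can never both exist. That is the sense in which doing this inside the filesystem removes the scan, the race, and the ambiguity at once.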
†AFAICT
††I emailed Krisman, the author of both the VFS-based case-insensitive lookups and per-directory case-insensitive support on EXT4 and F2FS.
Other operating systems have case-insensitive filesystems.
For example, macOS permits either case-insensitive (the default) or case-sensitive filesystems. Adobe Photoshop and Adobe Lightroom don't work well with a case-sensitive filesystem. This suggests that Adobe programs probably contain hardcoded paths written in different ways (maybe "Documents" in one library and "documents" in another), or that filters are sometimes applied to paths (e.g. lowercasing and removing spaces) so that the result differs from the actual path of the data. Nobody cared, because it just worked.
So if you now want to port a program made for one of the common proprietary operating systems of our era, you either have to fix all the paths so that filename case is always used consistently, or you rely on a filesystem that handles this for you.
Adobe could not do this even for macOS, so expect it to be much more difficult (and costly) for other vendors. See https://helpx.adobe.com/creative-suite/kb/error-case-sensitive-drives-supported.html