To what extent does Linux support file names longer than 255 bytes?
The answer, as often, is “it depends”.
Looking at the NTFS implementation in particular, it reports a maximum file name length of 255 to statvfs callers, so callers which interpret that as a 255-byte limit might pre-emptively avoid file names which would be valid on NTFS. However, most programs don’t check this (or even NAME_MAX) ahead of time, and rely on ENAMETOOLONG errors to catch over-long names. In most cases, the important limit is PATH_MAX, not NAME_MAX; that’s what’s typically used to allocate buffers when manipulating file names (at least in programs that don’t allocate path buffers dynamically, as expected by OSes like the Hurd, which doesn’t have arbitrary limits).
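To see what a given directory reports, the limits can be queried from the shell. This is a minimal sketch, assuming glibc's getconf (which accepts a path argument) and GNU coreutils' stat; show_limits is a made-up helper name and /tmp is only an example path:

```shell
# Print the file name and path limits that a directory reports.
# Assumes glibc's getconf and GNU stat; other systems may differ.
show_limits() {
    dir=$1
    # NAME_MAX and PATH_MAX as returned by pathconf(3) for this directory.
    printf 'NAME_MAX=%s\n' "$(getconf NAME_MAX "$dir")"
    printf 'PATH_MAX=%s\n' "$(getconf PATH_MAX "$dir")"
    # Maximum file name length reported by the filesystem (GNU stat's %l).
    printf 'namelen=%s\n' "$(stat -f -c %l "$dir")"
}

show_limits /tmp   # /tmp is only an example path
```

On an NTFS mount the reported value would be expected to be 255, even though that figure is not a byte count there.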
The NTFS implementation itself doesn’t check file name lengths in bytes, but always as 2-byte characters; file names which can’t be represented in an array of 255 2-byte elements will cause an ENAMETOOLONG error.
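The two ways of counting can be compared from the shell. This is a rough sketch, assuming iconv is available and the name is valid UTF-8; byte_len and utf16_units are hypothetical helper names, not part of any tool:

```shell
# Length of a name in bytes, which is how byte-based limits count it.
byte_len() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# Length of a name in UTF-16 code units (2-byte elements).
# Characters outside the BMP become surrogate pairs and count as two units.
utf16_units() {
    bytes=$(printf '%s' "$1" | iconv -f UTF-8 -t UTF-16LE | wc -c | tr -d ' ')
    echo $((bytes / 2))
}

name='日本語のファイル名'
echo "bytes: $(byte_len "$name"), UTF-16 units: $(utf16_units "$name")"
# prints: bytes: 27, UTF-16 units: 9
```

A name passes the 255-element check when utf16_units reports 255 or less, regardless of how many bytes it takes in UTF-8.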
Note that NTFS is generally handled by a FUSE driver on Linux. The kernel driver currently only supports UCS-2 characters, but the FUSE driver supports UTF-16 surrogate pairs (with the corresponding reduction in character length).
The limit for the length of a filename is indeed coded inside the filesystem, e.g. for ext4, from https://en.wikipedia.org/wiki/Ext4 :
Max. filename length 255 bytes
From https://en.wikipedia.org/wiki/XFS :
Max. filename length 255 bytes
From https://en.wikipedia.org/wiki/Btrfs :
Max. filename length 255 ASCII characters (fewer for multibyte character encodings such as Unicode)
From https://en.wikipedia.org/wiki/NTFS :
Max. filename length 255 UTF-16 code units
An overview of these limits for a number of file systems can be found at https://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits . There you can also see that ReiserFS has a higher limit (almost 4K), but the kernel itself (inside VFS, the kernel's virtual filesystem layer) has a limit of 255 bytes.
Your text is 160 UTF-16 code units long, in the encoding NTFS uses:
echo ゆく河の流れは絶えずして、しかももとの水にあらず。よどみに浮かぶうたかたは、かつ消えかつ結びて、久しくとどまりたるためしなし。世の中にある人とすみかと、またかくのごとし。たましきの都のうちに、棟を並べ、甍を争へる、高き、卑しき、人の住まひは、世々を経て尽きせぬものなれど、これをまことかと尋ぬれば、昔ありし家はまれなり。 > jp.txt
iconv -f utf-8 -t utf-16 jp.txt > jp16.txt
ls -ld jp*.txt
hexdump -C jp16.txt
This shows 0x140 = 320 bytes (plus a 2-byte byte order mark (BOM) prepended, if one is used), ignoring the trailing newline character. In other words, the text is 160 UTF-16 code units long: below the limit of 255 UTF-16 code units in NTFS, but more than 255 bytes.
So, here's what I've found out.
Coreutils don't particularly care about filename length and simply work with user input regardless of its length, i.e. there are zero checks.
For example, this works (the filename is 462 bytes long!):
name="和総坂裁精座回資国定裁出観産大掲記労。基利婚岡第員連聞余枚転屋内分。妹販得野取戦名力共重懲好海。要中心和権瓦教雪外間代円題気変知。貴金長情質思毎標豊装欺期権自馬。訓発宮汚祈子報議広組歴職囲世階沙飲。賞携映麻署来掲給見囲優治落取池塚賀残除捜。三売師定短部北自景訴層海全子相表。著漫寺対表前始稿殺法際込五新店広。"
cd /mnt/ntfs
touch "$name"
Even this works:
echo 123 > "$name"
cat "$name"
123
However, once you try to copy this file to any of the classic Linux filesystems, the operation fails:
cp "$name" /tmp
cp: cannot stat '/tmp/和総坂裁精座回資国定裁出観産大掲記労。基利婚岡第員連聞余枚転屋内分。妹販得野取戦名力共重懲好海。要中心和権瓦教雪外間代円題気変知。貴金長情質思毎標豊装欺期権自馬。訓発宮汚祈子報議広組歴職囲世階沙飲。賞携映麻署来掲給見囲優治落取池塚賀残除捜。三売師定短部北自景訴層海全子相表。著漫寺対表前始稿殺法際込五新店広。': File name too long
That is, cp has actually attempted to create this file in /tmp (the error comes from its stat call on the destination path), but the filesystem behind /tmp doesn't allow filenames longer than 255 bytes.
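A copy like this can be checked up front by comparing the name's length in bytes with the limit the destination reports. This is only a sketch, assuming glibc's getconf; fits_in_dir is a hypothetical helper and /tmp is an example target:

```shell
# Succeeds if the last path component of $1 fits within NAME_MAX of directory $2.
fits_in_dir() {
    base=${1##*/}                                   # strip leading directories
    len=$(printf '%s' "$base" | wc -c | tr -d ' ')  # length in bytes
    max=$(getconf NAME_MAX "$2")
    [ "$len" -le "$max" ]
}

# A 300-byte name built only for demonstration.
long=$(head -c 300 /dev/zero | tr '\0' 'x')

fits_in_dir "short.txt" /tmp && echo "short.txt fits in /tmp"
fits_in_dir "$long" /tmp || echo "300-byte name does not fit in /tmp"
```

Because the comparison is in bytes, it suits destinations with byte-based limits; a program should still handle the File name too long error itself, since a reported limit does not always match what the filesystem really accepts.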
I've also managed to open this file in mousepad (a GTK application), edit it and save it. It all worked, which means the 255-byte restriction applies only to certain Linux filesystems.
This doesn't mean everything will work. For instance, my favorite console file manager, Midnight Commander (a clone of Norton Commander), cannot list this file properly (it shows the file size as 0), open it, or do anything else with it:
Error
No such file or directory (2)