Newlines in filenames
I've never seen a file name with a newline other than ones deliberately created to test applications that manipulate file names. File names containing newlines can appear because:
- Some bug or user error (e.g. a bad copy-paste) resulted in an unintended file name.
- Some filesystem corruption affected a file name.
- Someone deliberately created a “strange” file name to exploit a security hole, where an application put more trust in the file names it was passed than it should have.
POSIX defines a filename as “a name consisting of 1 to {NAME_MAX} bytes used to name a file. The characters composing the name may be selected from the set of all character values excluding the slash character and the null byte. The filenames dot and dot-dot have special meaning.” There is no guarantee that every filesystem will accept “strange” file names (the only guaranteed characters are ASCII letters, digits, period, hyphen and underscore, i.e. A-Z, a-z, 0-9 and ._-, with hyphen forbidden in first position), but most native filesystems on modern unices do.
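To make the two rules above concrete, here is a minimal sketch in Python. The function names are hypothetical, and the default limit of 255 bytes is an assumption (a common NAME_MAX on Linux filesystems, not something the quoted definition guarantees). The first function checks the general POSIX definition, the second the narrower portable set.

```python
import string

# Portable filename character set: A-Z, a-z, 0-9, period, underscore, hyphen.
# Building a frozenset from a bytes object yields the integer byte values.
PORTABLE_BYTES = frozenset((string.ascii_letters + string.digits + "._-").encode("ascii"))


def is_valid_posix_filename(name: bytes, name_max: int = 255) -> bool:
    """True if name is 1..name_max bytes long and has no slash or NUL byte.

    name_max=255 is an assumed value; the real limit depends on the filesystem.
    """
    return 1 <= len(name) <= name_max and b"/" not in name and b"\0" not in name


def is_portable_filename(name: bytes, name_max: int = 255) -> bool:
    """True if name is valid, uses only portable characters, and has no leading hyphen."""
    return (
        is_valid_posix_filename(name, name_max)
        and all(byte in PORTABLE_BYTES for byte in name)
        and not name.startswith(b"-")
    )
```

With these definitions, a name such as b"a\nb" passes the first check but fails the second: a newline is allowed, just not portable.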
When writing a paper, I often collect a bibliography of PDF files from various sources. Not all of these contain the correct metadata, which means I sometimes copy-paste the title of the paper from the PDF viewer into the filename. This often results in newlines within the file name, but has never been an issue with any tools I have used.
IMHO there is nothing 'defensive' about coding to a standard, and the standard states that newlines are allowed in filenames. If your script does not handle all file names allowed in the standard, then your script is broken.
I've never seen NORMAL users use newlines in filenames. It appears that their primary purpose is to (1) make it easy for attackers to subvert your system, and to (2) make it harder to write secure programs :-(. However, modern Unix-likes (such as Linux) allow them, so you have to prepare for them if you want a program that resists attack.
"Filenames and Pathnames in Shell: How to do it correctly" shows how to handle this correctly.
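As a small illustration of what "preparing for them" can mean, the sketch below (Python, assuming a POSIX system where a newline in a file name is accepted) compares a newline-separated list of names with a NUL-terminated one. The helper names encode_names, decode_names and demo are hypothetical, and this is not taken from the essay mentioned above. It relies only on the fact that the NUL byte can never occur inside a file name.

```python
import os
import tempfile


def encode_names(names):
    """Serialize names as a NUL-terminated byte string (NUL cannot occur in a name)."""
    return b"".join(os.fsencode(name) + b"\0" for name in names)


def decode_names(blob):
    """Split a NUL-terminated byte string back into names."""
    # The final element after splitting is the empty string that follows
    # the last terminator, so it is dropped.
    return [os.fsdecode(part) for part in blob.split(b"\0")[:-1]]


def demo():
    """Create two files, one with a newline in its name, and list them both ways."""
    with tempfile.TemporaryDirectory() as directory:
        odd_name = "title line one\nline two.pdf"
        for name in (odd_name, "plain.pdf"):
            open(os.path.join(directory, name), "w").close()
        names = sorted(os.listdir(directory))
        # Naive approach: one name per line, then split on newlines.
        line_based = "\n".join(names).split("\n")
        # Robust approach: NUL-terminated list.
        nul_based = decode_names(encode_names(names))
        return names, line_based, nul_based
```

Calling demo() on such a system returns two real names, three entries from the line-based split, and the original two names from the NUL-based round trip. The same idea is behind NUL-delimited modes such as find -print0 and xargs -0.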