Do file-extensions have any purpose (for the operating system)?
There is no 100% black or white answer here.
Usually Linux does not rely on file names (or on file extensions, i.e. the part of the file name after the last period). Instead it determines the file type by examining the first few bytes of the content and comparing them to a list of known magic numbers.
For example, all Bitmap image files (usually with the name extension .bmp) must start with the letters BM in their first two bytes. Scripts in most scripting languages like Bash, Python, Perl, AWK, etc. (basically everything that treats lines starting with # as comments) may contain a shebang like #!/bin/bash as the first line. This special comment tells the system which interpreter to run the file with when it is executed.
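The idea of checking the first bytes can be sketched in a few lines of Python. This is only an illustration: the table below is a tiny hand-written sample, and the names MAGIC_NUMBERS, sniff_type and sniff_file are made up for this example, not part of any system tool.

```python
# Illustrative sample of magic numbers; real tools consult a far larger database.
MAGIC_NUMBERS = {
    b"BM": "BMP image",
    b"#!": "script with shebang",
    b"\x89PNG\r\n\x1a\n": "PNG image",
}


def sniff_type(first_bytes: bytes) -> str:
    """Return a type label based only on the leading bytes, never on a name."""
    for magic, label in MAGIC_NUMBERS.items():
        if first_bytes.startswith(magic):
            return label
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first few bytes of a file on disk and classify them."""
    with open(path, "rb") as handle:
        return sniff_type(handle.read(16))


if __name__ == "__main__":
    print(sniff_type(b"BM\x36\x00\x00\x00"))       # BMP image
    print(sniff_type(b"#!/bin/bash\necho hi\n"))   # script with shebang
```

Note that the file name never enters the decision, so a BMP called holiday or holiday.txt is classified the same way.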
So normally the operating system relies on the file content and not its name to determine the file type, but stating that file extensions are never needed on Linux is only half of the truth.
Applications may of course implement their file checks however they want, which includes verifying the file name and extension. An example is Eye of GNOME (eog, the standard picture viewer), which determines the image format by the file extension and throws an error if it does not match the content. Whether this is a bug or a feature is open to debate...
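A check of that kind, comparing what the extension promises with what the content starts with, might look roughly like the sketch below. It is not eog's actual code; the EXPECTED_MAGIC table and the function name are assumptions made for illustration.

```python
import os

# Hypothetical table: which leading bytes each extension promises.
EXPECTED_MAGIC = {
    ".bmp": b"BM",
    ".png": b"\x89PNG\r\n\x1a\n",
    ".gif": b"GIF8",
}


def extension_matches_content(filename: str, content: bytes) -> bool:
    """Return False only when a known extension contradicts the leading bytes."""
    extension = os.path.splitext(filename)[1].lower()
    expected = EXPECTED_MAGIC.get(extension)
    if expected is None:
        # Unknown or missing extension: nothing to compare against.
        return True
    return content.startswith(expected)


if __name__ == "__main__":
    png_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
    print(extension_matches_content("photo.png", png_bytes))  # True
    print(extension_matches_content("photo.bmp", png_bytes))  # False
```

An application that refuses to open photo.bmp in the second case is being strict about the name, even though the content alone would have been enough to identify the format.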
However, even some parts of the operating system rely on file name extensions, e.g. when parsing your software sources files in /etc/apt/sources.list.d/: only files with the *.list extension get parsed (newer apt versions also read *.sources files there), and all others are ignored. The extension is perhaps not mainly used to determine the file type here but rather to enable or disable parsing of certain files, but it is still a file extension that affects how the system treats a file.
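The selection rule described above can be mimicked in Python. This sketch models only the extension rule as stated in the text; whatever else apt checks is not covered, and the function name is invented for this example.

```python
def files_selected_by_extension(names, suffixes=(".list",)):
    """Keep only names ending in one of the given suffixes.

    The default follows the rule as stated in the text; pass
    (".list", ".sources") to include the newer format as well.
    """
    return sorted(name for name in names if name.endswith(suffixes))


if __name__ == "__main__":
    directory_listing = [
        "google-chrome.list",
        "google-chrome.list.save",
        "some-ppa.list.disabled",
        "notes.txt",
    ]
    print(files_selected_by_extension(directory_listing))
    # ['google-chrome.list']
```

This is why renaming a file from something.list to something.list.disabled is enough to switch that source off without deleting it.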
And of course the human user profits most from file extensions, as they make the type of a file obvious and also allow multiple files with the same base name and different extensions, like site.html, site.php, site.js, site.css etc. The disadvantage is of course that the file extension and the actual file type/content do not necessarily have to match.
Additionally, extensions are needed for cross-platform interoperability, as e.g. Windows will not know what to do with a readme file, but only with a readme.txt.
Linux determines the type of a file via a code in the file header. It doesn't depend on file extensions to know which software to use for opening the file.
That's what I remember from my education. Please correct me in case I'm wrong!
- correctly remembered.
Are these extensions meant only for humans?
- Yes, with a but.
When you interact with other operating systems that do depend on extensions, it is smarter to use them.
In Windows, opening software is attached to the extensions.
Opening a text file named "file" is harder in Windows than opening the same file named "file.txt" (you will need to switch the file open dialog from *.txt to *.* every time). The same goes for tab-separated and semicolon-separated text files, and for importing and exporting e-mails (.mbox extension).
This matters in particular when you write software. Working with a file named "software1" that is an HTML file and "software2" that is a JavaScript file is more difficult than working with "software.html" and "software.js".
If there is a system in place in Linux where file extensions are important, I would call that a bug. When software depends on file extensions, that is exploitable. We use an interpreter directive to identify what a file is: "the first two bytes in a file can be the characters "#!", which constitute a magic number (hexadecimal 23 and 21, the ASCII values of "#" and "!") often referred to as shebang".
The most famous problem with file extensions was LOVE-LETTER-FOR-YOU.TXT.vbs on Windows. This is a Visual Basic script being shown in the file explorer as a text file.
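Why that name is misleading can be shown with Python's pathlib. The sketch assumes a file manager that hides the final extension from the user; the two helper names are made up for this example.

```python
from pathlib import PurePath


def real_extension(filename: str) -> str:
    """Only the part after the last period counts as the extension."""
    return PurePath(filename).suffix


def name_with_extension_hidden(filename: str) -> str:
    """What a user sees if the final extension is hidden (an assumption here)."""
    return PurePath(filename).stem


if __name__ == "__main__":
    name = "LOVE-LETTER-FOR-YOU.TXT.vbs"
    print(real_extension(name))              # .vbs
    print(name_with_extension_hidden(name))  # LOVE-LETTER-FOR-YOU.TXT
```

The system acts on .vbs, while the user sees a name that ends in .TXT.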
In Ubuntu, when you start a file from Nautilus you get a warning about what it is going to do. A script launched from Nautilus that wants to start some software, when the file was supposed to simply open in gEdit, is obviously a problem, and we get a warning about it.
On the command line, when you execute something, you can see what the extension is. If it ends in .vbs I would start to become suspicious (not that .vbs is executable on Linux, at least not without some more effort ;) ).
As mentioned by others, in Linux an interpreter directive method is used (storing some metadata in a file as a header or magic number so the correct interpreter can be told to read it) rather than the filename extension association method used by Windows.
This means you can create a file with almost any name you like... with a few exceptions.
However, I would like to add a word of caution.
If you have some files on your system from a system that uses filename association, the files may not have those magic numbers or headers. Filename extensions are used to identify these files by applications that are able to read them, and you may experience some unexpected effects if you rename such files. For example:
If you rename a file My Novel.doc to My-Novel, Libreoffice will still be able to open it, but it will open as 'Untitled' and you will have to name it again in order to save it (Libreoffice adds an extension by default, so you would then have two files, My-Novel and My-Novel.odt, which could be annoying).
More seriously, if you rename a file My Spreadsheet.xlsx to My-Spreadsheet and then try to open it with xdg-open My-Spreadsheet, you will get this (because it's actually a compressed file):
And if you rename a file My Spreadsheet.xls to My-Spreadsheet, when you xdg-open My-Spreadsheet you get an error saying
error opening location: No application is registered as handling this file
(Although in both these cases it works OK if you do soffice My-Spreadsheet)
If you then rename the extensionless file to My-Spreadsheet.ods with mv and try to open it you will get this:
(repair fails)
And you will have to put the original extension back on to open the file correctly (you can then convert the format if you wish).
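The "it's actually a compressed file" effect can be reproduced without a real spreadsheet. The sketch below is an assumption-laden stand-in: it builds a small ZIP archive in memory (an .xlsx file is a ZIP-based container) and shows that a content-based check sees only the archive once the name no longer helps. The function names and the member name are invented for this example.

```python
import io
import zipfile


def looks_like_zip_container(data: bytes) -> bool:
    """Content-only check: does this byte string form a ZIP archive?"""
    return zipfile.is_zipfile(io.BytesIO(data))


def make_stand_in_archive() -> bytes:
    """Build a tiny ZIP in memory as a stand-in for a renamed .xlsx file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("sheet.xml", "<worksheet/>")  # hypothetical member
    return buffer.getvalue()


if __name__ == "__main__":
    print(looks_like_zip_container(make_stand_in_archive()))
    print(looks_like_zip_container(b"just some plain text, not an archive at all"))
```

Without the .xlsx extension, nothing in the outer bytes says "spreadsheet"; a generic handler only sees a ZIP archive, which fits the behaviour described above.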
TL;DR:
If you have non-native files with name extensions, don't remove the extensions assuming everything will be OK!