Why is the root directory denoted by a / sign?
The forward slash / is the delimiting character that separates directories in paths on Unix-like operating systems. It seems to have been chosen sometime in the 1970s. According to anecdotal sources, the reason may be that Multics, the predecessor of Unix, used the > character as its path separator, but the designers of Unix had already reserved the characters > and < for I/O redirection on the shell command line well before they had a multi-level file system. So when the time came to design the filesystem, they had to find another character to separate pathname elements.
A thing to note here is that on the Lear-Siegler ADM-3A terminal, which was in common use during the 1970s and which is also the origin of the practice of using the ~ character to represent the home directory, the / key is next to the > key.
As for why the root directory is denoted by a single /, it is a convention most likely influenced by the fact that the root directory is the top-level directory of the directory hierarchy, and while other directories may be beneath it, there usually isn't a reason to refer to anything outside the root directory. Similarly, the directory entry itself has no name, because it's the boundary of the visible directory tree.
The first hierarchical file system as we know it today was designed for Multics. The design is described in “A General-Purpose File System For Secondary Storage” by R.C. Daley and P.G. Neumann. A salient characteristic of this filesystem is that a directory is a file which can be contained in a directory like any other file. The file structure forms a tree, in which all of the non-leaf nodes are directories. The root of the tree is always a directory. Each file has a name (the entry name) which is unique within its parent directory. The root directory doesn't have a name since it isn't contained in another directory.
In order to designate a file, you need to describe the path from the root of the tree. Multics adopted a natural syntax for path names where if P is the path to a directory and F is the name of a file, then P>F is the syntax for the file called F inside the directory whose path is P.
For those times when you don't want to burden yourself with directories, Multics had a notion of working directory. A bare file name with no directory indication is interpreted as a file in the working directory.
Combining these rules, foo is a file in the working directory; foo>bar is a file in the child directory foo of the working directory, and so on. These rules describe relative paths, but a supplementary rule is needed to build absolute paths starting from the root directory. Given that reading a path name from left to right corresponds to moving from the root to the leaves of the tree, the root should be indicated by a special marker at the left of the path name. Since file names are never empty (because that would often be confusing), no relative path name ever starts with the character >, which makes it a convenient marker for absolute path names. Thus >foo is the file called foo in the root directory, >foo>bar is the file called bar in the directory called foo in the root directory, and so on. This leaves the root directory, which could be the empty string; however, it's often not convenient to use the empty string as a pathname, so instead it gets written >, which has the added benefit that a pathname is absolute if and only if its first character is >.
Unix adopted this design from Multics. Since Unix had already used the character > for output redirection in its command shell, its designers chose a different character, /, to separate directories in path names.
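To make the correspondence concrete, here is a small shell sketch that rewrites a Multics-style path into the Unix form by swapping the separator. The function name multics_to_unix is made up for this illustration, and the sketch only covers the > rules described above; it ignores any other Multics pathname syntax.

```shell
# Hypothetical helper: rewrite a Multics-style path using / as the separator.
# Only the separator changes; the absolute/relative rule (a leading separator
# means "start at the root") carries over unchanged.
multics_to_unix() {
    printf '%s\n' "$1" | tr '>' '/'
}

multics_to_unix '>foo>bar'   # absolute path
multics_to_unix 'foo>bar'    # relative path
multics_to_unix '>'          # the root directory
```

Note that the Multics paths have to be quoted here, precisely because the shell treats an unquoted > as output redirection.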
In path name components on Unix, only two characters may not be used: the null character, which terminates strings in C (the language of the kernel) and the slash, which is reserved as the path separator. Furthermore, path components cannot be empty strings.
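As a rough illustration of these rules, the following shell sketch checks whether a string could serve as a single path component. The function name is_valid_component is hypothetical. The null character is not tested explicitly, on the assumption that a shell variable cannot hold one in the first place.

```shell
# Hypothetical helper: succeed if $1 could be a single path component.
# Rejects the empty string and anything containing a slash. A null character
# is not checked, since shell variables cannot carry one anyway.
is_valid_component() {
    case "$1" in
        "" | */*) return 1 ;;
        *)        return 0 ;;
    esac
}

for name in "notes.txt" "a b" "" "usr/bin"; do
    if is_valid_component "$name"; then
        printf '"%s" is a valid component\n' "$name"
    else
        printf '"%s" is not a valid component\n' "$name"
    fi
done
```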
So, in a path name, we have only two kinds of tokens: a slash, and a component.
Suppose that, without adding any new tokens, we would like to support two types of paths, relative and absolute. Furthermore, we would like to be able to refer to the root directory, which has no name (it has no parent which would give it a name).
How can we represent relative paths, absolute paths, and refer to the root directory, using only the slash?
The most obvious way to extend a language (other than introducing a new token) is to create new syntax: give new meaning to combinations of tokens that would otherwise be invalid.
A path which begins with a slash would otherwise make no sense, since it would start with an empty component, so why not use a leading slash as a marker which indicates "this path is absolute, rather than relative"?
A path which contains nothing but a slash is also invalid, so why not assign it the meaning "the root directory"?
These two meanings tie together because an absolute path begins searching at the root directory. In other words a leading slash can be regarded as having the meaning:
- navigate to the root directory, and consume the slash character.
- if there is more material in the path, then process it as a relative path, otherwise you're done.
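The two steps above can be sketched in shell using nothing but string matching. The helper names path_kind and rest_after_root are invented for this example, and the sketch deliberately consumes only a single leading slash, as in the description.

```shell
# Hypothetical helper: classify a path by looking only at its text.
path_kind() {
    case "$1" in
        "")  echo invalid ;;    # the empty string is not a path
        /)   echo root ;;       # nothing but a slash: the root directory
        /*)  echo absolute ;;   # leading slash: start at the root
        *)   echo relative ;;   # anything else: start at the working directory
    esac
}

# Hypothetical helper: consume one leading slash and print what is left,
# which is then processed like a relative path (empty if we are done).
rest_after_root() {
    printf '%s\n' "${1#/}"
}

path_kind /
path_kind /usr/bin
path_kind foo/bar
rest_after_root /usr/bin
```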
Then, we might as well throw in a trailing slash, which can mean "this path asserts that the last path component is the name of a directory rather than a regular file or any other type of object: that trailing slash denotes that directory similarly to the way the leading slash denotes the root directory."
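The effect of a trailing slash can be observed with a quick experiment. This is a sketch assuming a POSIX-like system with mktemp available; the helper name resolves and the scratch file names are made up for the example. On such a system the directory is expected to resolve with or without the trailing slash, while the regular file is expected to stop resolving once a slash is appended.

```shell
# Scratch directory holding one directory and one regular file.
tmp=$(mktemp -d)
mkdir "$tmp/dir"
: > "$tmp/file"

# Hypothetical helper: report whether a path resolves to an existing object.
resolves() {
    if [ -e "$1" ]; then echo yes; else echo no; fi
}

dir_plain=$(resolves "$tmp/dir")
dir_slash=$(resolves "$tmp/dir/")
file_plain=$(resolves "$tmp/file")
file_slash=$(resolves "$tmp/file/")

printf 'dir: %s  dir/: %s  file: %s  file/: %s\n' \
    "$dir_plain" "$dir_slash" "$file_plain" "$file_slash"

# Clean up the scratch directory.
rm -rf "$tmp"
```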
Even with all of the syntax above, some combinations still have no assigned meaning: double slashes, triple slashes, and so forth.
Why not just introduce another token and do it differently? This is probably because the designers took minimalistic approaches in general. (Why does the ed editor only display a ? when you do something wrong?) The slash is easy to type, requiring no shift. A path language with only two token types (component and slash) is easy to remember and use.
Another important consideration is that easy manipulations of paths are possible using only string representations. For instance, we can "re-root" absolute paths to a new parent directory quite easily:
OLD_PATH=/old/path
NEW_HOME=/new/home
NEW_PATH="$NEW_HOME$OLD_PATH"   # /new/home/old/path
This would not work if we indicated absolute paths in some other way, such as a leading caret or some other marker character:
OLD_PATH=^old/path # ^ means absolute path
NEW_HOME=^new/home
# now we need more string kung-fu than just catenation
NEW_PATH="$NEW_HOME/${OLD_PATH#^}"
This type of coding is still needed in some cases when dealing with Unix-style paths, but there is less of it.