What is the difference between a symbolic link and a hard link?
Some examples that might help.
Create two files with data in them:
$ printf Cat > foo
$ printf Dog > bar
Create a hard and soft (aka symbolic) link:
$ ln foo foo-hard
$ ln -s bar bar-soft
List directory contents in long format by increasing size:
$ ls -lrS
lrwxr-xr-x 1 user staff 3 3 Apr 15:25 bar-soft -> bar
-rw-r--r-- 2 user staff 4 3 Apr 15:25 foo-hard
-rw-r--r-- 2 user staff 4 3 Apr 15:25 foo
-rw-r--r-- 1 user staff 4 3 Apr 15:25 bar
This tells us that:
1st column: the file mode differs between the soft and hard links
- soft link: lrwxr-xr-x
  - filetype: l = symbolic link
  - owner permissions: rwx = readable, writable, executable
  - group permissions: r-x = readable, not writable, executable
  - other permissions: r-x = readable, not writable, executable
- hard link: -rw-r--r--
  - filetype: - = regular file
  - owner permissions: rw- = readable, writable, not executable
  - group permissions: r-- = readable, not writable, not executable
  - other permissions: r-- = readable, not writable, not executable
2nd column: the number of links is higher for the hard linked files
5th column: the size of the soft link is the length of the path it stores (3 bytes for bar), not the size of the target's contents, because it is a reference to a name rather than to the data
last column: the symbolic link shows the linked-to file via ->
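To make the 5th-column point concrete, here is a minimal sketch that reads back what a soft link stores. It works in a throwaway directory; the variable names (workdir, target, link_size) are mine, not from the answer above, and it assumes readlink and mktemp are available, which is the case on Linux and macOS.

```shell
#!/bin/sh
# Sketch: a symlink stores a path, and its reported size is that path's length.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf Dog > bar        # regular file with 3 bytes of content
ln -s bar bar-soft      # the link stores the 3-character path "bar"

# readlink prints the stored path without following it
target=$(readlink bar-soft)
# 5th column of ls -l is the size of the link itself, not of its target
link_size=$(ls -l bar-soft | awk '{print $5}')

echo "bar-soft stores '$target' (${#target} characters), size $link_size"

# clean up the scratch directory
cd "$OLDPWD" && rm -rf "$workdir"
```

If the target were a longer path, the link's size would grow with it, regardless of how large the target file is.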
Changing the filename of foo does not affect foo-hard:
$ mv foo foo-new
$ cat foo-hard
Cat
Changing the contents of the original file (now named foo-new) is reflected in foo-hard:
$ printf Dog >> foo-new
$ cat foo-hard
CatDog
Hard links like foo-hard point to the inode of the file, and through it to the contents.
This is not the case for soft links like bar-soft:
$ mv bar bar-new
$ ls bar-soft
bar-soft
$ cat bar-soft
cat: bar-soft: No such file or directory
The contents of the file could not be found because the soft link points to the name, which was changed, and not to the contents.
Likewise, if foo is deleted, foo-hard still holds the contents; if bar is deleted, bar-soft is just a link to a non-existing file.
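A script can detect this "link to a non-existing file" state itself. The sketch below relies on two standard test operators: -L is true for any symbolic link, while -e follows the link and so fails when the target is gone. The variable names (workdir, status) are illustrative.

```shell
#!/bin/sh
# Sketch: detect a dangling soft link with test -L and test -e.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf Dog > bar
ln -s bar bar-soft
mv bar bar-new          # the name "bar" no longer exists

# -L checks the link itself; -e checks what the link resolves to
if [ -L bar-soft ] && [ ! -e bar-soft ]; then
    status=dangling
else
    status=ok
fi
echo "bar-soft is $status"

# clean up the scratch directory
cd "$OLDPWD" && rm -rf "$workdir"
```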
As the saying goes, a picture is worth a thousand words. Here is how I visualize it:
Here is how we get to that picture:
- Create a name myfile.txt in the file system that points to a new inode (which contains the metadata for the file and points to the blocks of data that contain its contents, i.e. the text "Hello, World!"):
  $ echo 'Hello, World!' > myfile.txt
- Create a hard link my-hard-link to the file myfile.txt, which means "create a file that should point to the same inode that myfile.txt points to":
  $ ln myfile.txt my-hard-link
- Create a soft link my-soft-link to the file myfile.txt, which means "create a file that should point to the file myfile.txt":
  $ ln -s myfile.txt my-soft-link
Look what will now happen if myfile.txt is deleted (or moved): my-hard-link still points to the same contents, and is thus unaffected, whereas my-soft-link now points to nothing. Other answers discuss the pros/cons of each.
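The "points to the same inode" part can be checked directly by comparing inode numbers. This is a rough sketch using the three names from the steps above; the helper inode_of and the variable names are my own. It uses ls -di, where -d keeps ls from following a symlink given on the command line.

```shell
#!/bin/sh
# Sketch: a hard link shares the original's inode; a soft link has its own.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

echo 'Hello, World!' > myfile.txt
ln myfile.txt my-hard-link
ln -s myfile.txt my-soft-link

# ls -i prints the inode number as the first field
inode_of() { ls -di "$1" | awk '{print $1}'; }

file_inode=$(inode_of myfile.txt)
hard_inode=$(inode_of my-hard-link)
soft_inode=$(inode_of my-soft-link)

echo "myfile.txt:   $file_inode"
echo "my-hard-link: $hard_inode (same inode as myfile.txt)"
echo "my-soft-link: $soft_inode (an inode of its own)"

# clean up the scratch directory
cd "$OLDPWD" && rm -rf "$workdir"
```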
Underneath the file system, files are represented by inodes, one inode per file.
A file in the file system is basically a link to an inode.
A hard link, then, just creates another file with a link to the same underlying inode.
When you delete a file, it removes one link to the underlying inode. The inode is only deleted (or deletable/over-writable) when all links to the inode have been deleted.
A symbolic link is a link to another name in the file system.
Once a hard link has been made, the link is to the inode. Deleting, renaming, or moving the original file will not affect the hard link, as it links to the underlying inode. Any changes to the data on the inode are reflected in all files that refer to that inode.
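The link counting described above can be watched in the 2nd column of ls -l. A small sketch, with a hypothetical helper links_of and illustrative variable names:

```shell
#!/bin/sh
# Sketch: the inode's link count rises with ln and falls with rm;
# the data stays reachable as long as one name remains.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# 2nd column of ls -l is the number of links to the inode
links_of() { ls -l "$1" | awk '{print $2}'; }

printf Cat > foo
before=$(links_of foo)          # one name so far

ln foo foo-hard
after_link=$(links_of foo)      # two names for the same inode

rm foo                          # removes one name, not the data
after_rm=$(links_of foo-hard)
contents=$(cat foo-hard)

echo "links: $before -> $after_link -> $after_rm, contents still: $contents"

# clean up the scratch directory
cd "$OLDPWD" && rm -rf "$workdir"
```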
Note: Hard links are only valid within the same file system. Symbolic links can span file systems, as they simply store the name of another file.
Hard links are useful when the original file is getting moved around, for example from /bin to /usr/bin or to /usr/local/bin. Any symlink to the file in /bin would be broken by this, but a hard link, being a link directly to the inode for the file, wouldn't care, as long as the move stays within one file system (a move across file systems copies the data to a new inode).
Hard links may take less disk space as they only take up a directory entry, whereas a symlink needs its own inode to store the name it points to.
Hard links also take less time to resolve - symlinks can point to other symlinks that are in symlinked directories. And some of these could be on NFS or other high-latency file systems, and so could result in network traffic to resolve. Hard links, being always on the same file system, are always resolved in a single look-up, and never involve network latency (if it's a hardlink on an NFS filesystem, the NFS server would do the resolution, and it would be invisible to the client system). Sometimes this is important. Not for me, but I can imagine high-performance systems where this might be important.
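To illustrate the multi-step resolution of symlinks, here is a sketch that follows a chain one hop at a time with readlink. The file names (data, link1, link2, link3) are made up for the example, and the loop only handles relative targets in the current directory, which is all this demo needs.

```shell
#!/bin/sh
# Sketch: count how many symlink hops it takes to reach the real file.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf Cat > data
ln -s data  link1       # link1 -> data
ln -s link1 link2       # link2 -> link1
ln -s link2 link3       # link3 -> link2

name=link3
hops=0
while [ -L "$name" ]; do
    name=$(readlink "$name")    # read one stored path, i.e. one hop
    hops=$((hops + 1))
done
echo "link3 resolved to $name after $hops hops"

# clean up the scratch directory
cd "$OLDPWD" && rm -rf "$workdir"
```

A hard link to data would need no such loop: its directory entry already names the inode.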
I also think things like mmap(2) and even open(2) use the same mechanism as hard links to keep a file's inode active, so that even if the file gets unlink(2)ed, the inode remains to allow the process continued access, and the file really goes away only once the process closes it. This allows for much safer temporary files where you can read/write your data without anyone else being able to access it, provided you can get the open and unlink to happen atomically (there may be a POSIX API for that which I'm not remembering). Well, that was true before /proc gave everyone the ability to look at your file descriptors, but that's another story.
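The open-then-unlink behaviour can be tried from a shell with plain file descriptors. This is only a sketch: the open and the rm are two separate steps here, so it is not the atomic variant mentioned above, and the file name scratch and the descriptor numbers 3 and 4 are arbitrary choices.

```shell
#!/bin/sh
# Sketch: a file stays usable through open descriptors after its last name is removed.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

exec 3> scratch         # fd 3: opened for writing (creates the file)
exec 4< scratch         # fd 4: opened for reading, with its own offset
rm scratch              # remove the only name; the inode lives on

if [ -e scratch ]; then name_exists=yes; else name_exists=no; fi

printf 'secret\n' >&3   # write through the nameless file
read -r line <&4        # read it back through the other descriptor
echo "name exists: $name_exists, read back: $line"

exec 3>&- 4<&-          # closing the last descriptors frees the inode

# clean up the scratch directory
cd "$OLDPWD" && rm -rf "$workdir"
```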
Speaking of which, recovering a file that is open in process A, but unlinked on the file system revolves around using hardlinks to recreate the inode links so the file doesn't go away when the process which has it open closes it or goes away.
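For the recovery case, the approach I'm confident works on Linux is copying the data out through /proc/PID/fd/N while the process still has the file open. Note that this creates a new inode rather than re-linking the old one. The sketch below plays both roles in one shell (it is its own "process A"); the file names doomed and recovered are invented, and the guard skips the copy on systems without a Linux-style /proc.

```shell
#!/bin/sh
# Sketch (Linux-specific): recover an unlinked but still-open file via /proc.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf 'precious\n' > doomed
exec 3< doomed          # this shell plays "process A" and holds the file open
rm doomed               # the name is gone, the inode is still in use

pid=$$                  # in real use: the PID of the process holding the file
recovered=
if [ -r "/proc/$pid/fd/3" ]; then
    cp "/proc/$pid/fd/3" recovered      # copy the data to a new name
    recovered=$(cat recovered)
    echo "recovered: $recovered"
else
    echo "no /proc/$pid/fd/3 here; skipping" >&2
fi

exec 3<&-               # the original inode is freed once this closes

# clean up the scratch directory
cd "$OLDPWD" && rm -rf "$workdir"
```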