How can I have two files with the same name in a directory when mounted with NFS?
A friend helped me track this down: it is a bug in the Linux kernel, recorded as Bugzilla 38572. The bug is supposedly fixed in version 3.0.0 of the kernel, but is present in at least version 2.6.38.
The issue is that the server's READDIR RPC returns incorrect results. This occurs because of the following:
When the client reads a directory, it specifies a maximum buffer size and zeroes a cookie. If the directory is too large for one reply, the reply is marked as partial and carries an updated cookie. The client then re-executes the RPC with the updated cookie to get the next chunk of data. (The data is a set of file handles and names. In the case of READDIRPLUS, there is also stat/inode/vnode data.) The documentation does not indicate that this is a bug with READDIRPLUS, but it is probably there as well.
The actual problem is that the last file in each chunk (name, handle tuple) is sometimes returned as the first file in the next chunk.
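To make the paging behaviour concrete, here is a toy shell model. It is only a sketch under loud assumptions: the list of names, the chunk size, and the function name read_dir_sim are made up for illustration, the cookie is modelled as a plain index, and none of this is real NFS protocol or kernel code. With the argument 1, the simulated server resumes one entry too early, so the last name of each chunk reappears at the start of the next.

```shell
#!/bin/sh
# Toy model of cookie-based directory paging (NOT real NFS code).
# Every name below is hypothetical and exists only for this illustration.

entries='a b c d e f g'   # pretend directory contents on the server
chunk=3                   # pretend maximum number of entries per reply

# read_dir_sim BUG
#   BUG=0: the cookie points just past the last entry already returned.
#   BUG=1: the cookie is one entry short, so each new chunk starts
#          with the last entry of the previous chunk.
read_dir_sim() {
    bug=$1
    cookie=0
    set -- $entries       # positional parameters now hold the names
    n=$#
    while :; do
        # One "reply": up to $chunk names starting at index $cookie.
        i=0
        for name in "$@"; do
            if [ "$i" -ge "$cookie" ] && [ "$i" -lt $((cookie + chunk)) ]; then
                printf '%s\n' "$name"
            fi
            i=$((i + 1))
        done
        # The reply that reaches the end of the directory is the final one.
        if [ $((cookie + chunk)) -ge "$n" ]; then
            break
        fi
        cookie=$((cookie + chunk - bug))
    done
}

echo "correct server: $(read_dir_sim 0 | xargs)"
echo "buggy server:   $(read_dir_sim 1 | xargs)"
```

In this model the buggy run lists c and e twice, which is the same symptom as the duplicated directory entry seen on the client.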
There is a bad interaction with the underlying filesystem: ext4 exhibits this, XFS does not.
This is why the problem appears in some situations but not in others and rarely occurs on small directories. As seen in the question description, the files show the same inode number and the names are identical (not corrupted). Since the Linux kernel calls the vnode operations for underlying operations such as open(), etc., the file system's underlying routines decide what happens. In this case, the NFS3 client just translates the vnode operation into an RPC if the required information isn't in its attribute cache. This leads to confusion since the client believes the server can't do this.
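One way to check whether a client-side listing shows this symptom (same inode number, identical name) is to look for repeated lines in the output of ls -i. The following is a minimal sketch; the function names repeated_lines and dup_entries are hypothetical helpers, and parsing ls output like this will misbehave on names that contain newlines.

```shell
#!/bin/sh
# repeated_lines: print each input line that occurs more than once.
repeated_lines() {
    sort | uniq -d
}

# dup_entries DIR: list DIR with inode numbers, one entry per line,
# and report any "inode name" line that appears more than once.
# Caveat: this parses ls output, so names containing newlines break it.
dup_entries() {
    ls -1ai -- "${1:-.}" | repeated_lines
}
```

Running dup_entries on the NFS-mounted directory and getting no output would suggest that no entry was listed twice; any line it prints is an inode/name pair that the listing returned more than once.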
The disk is an NFS-mounted disk. When I go to the host computer that exports the drive, the file is listed only once.
Probably a bug, issue, or race condition with NFS.
It's possible to have two files of the same name if you directly edit the filesystem structures using a hex editor. However, I'm not sure what would happen if you tried to delete or open the files. I'm not sure what tools exist on Linux to access a file by inode number (which can't be duplicated), but that may work.
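For what it's worth, find can select files by inode number with its -inum test, and ls -i shows the inode number of each entry. A minimal sketch follows; the wrapper name by_inode is hypothetical, and -maxdepth is an extension available in GNU and BSD find rather than part of POSIX.

```shell
#!/bin/sh
# by_inode DIR INUM: print the entries directly inside DIR whose inode
# number is INUM. -xdev keeps find on the same filesystem, because inode
# numbers are only unique within one filesystem.
by_inode() {
    find "$1" -maxdepth 1 -xdev -inum "$2"
}
```

For example, after reading the inode number from ls -i, calling by_inode . 123456 (a made-up number) prints the matching path. Hard links share an inode, so more than one path may be printed.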
Duplicate file names are something fsck would likely catch and try to fix.
Make sure none of the files have differing trailing spaces though.
There is a chance that you have a hidden non-printable character or whitespace in one of the filenames. You can check by providing the -b option to ls, e.g.:
user@server:~/test$ ls -lab
total 8
drwxr-xr-x 2 user user 4096 Sep 3 12:20 .
drwx------ 8 user user 4096 Sep 3 12:20 ..
-rw-r--r-- 1 user user 0 Sep 3 12:19 hello
-rw-r--r-- 1 user user 0 Sep 3 12:19 hello\
Note the \ signifying the space at the end of that filename.
-b, --escape
print C-style escapes for nongraphic characters
As an alternative (though the above should work), you can pipe the output through this Perl script to replace anything that isn't a printable ASCII character with its hex code. For example, a space becomes \x20.
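# Read input line by line and escape every character outside "!" to "~".
while (<>) {
    chomp();
    while (/(.)/g) {
        $c = $1;
        if ($c =~ /[!-~]/) {
            # Printable, non-space ASCII: print unchanged.
            print("$c");
        } else {
            # Anything else (including a space) is printed as \xNN.
            printf("\\x%.2x", ord($c));
        }
    }
    print("\n");
}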
Usage:
ls -la | perl -e 'while(<>){chomp();while(/(.)/g){$c=$1;if($c=~/[!-~]/){print("$c");}else{printf("\\x%.2x",ord($c));}}print("\n");}'