flock(): removing locked file without race condition?
- In Unix it is possible to delete a file while it is opened - the inode will be kept until all processes have ended that have it in their file descriptor list
- In Unix it is possible to check that a file has been removed from all directories by checking the link count as it becomes zero
So instead of comparing the ino-value of the old/new file paths you can simply check the nlink count on the file that is already open. It assumes that it is just an ephemeral lock file and not a real mutex resource or device.
lockfile = "/tmp/some_name.lock";
for(int attempt; attempt < timeout; ++attempt) {
int fd = open(lockfile, O_CREAT, 0444);
int done = flock(fd, LOCK_EX | LOCK_NB);
if (done != 0) {
close(fd);
sleep(1); // lock held by another proc
continue;
}
struct stat st0;
fstat(fd, &st0);
if(st0.st_nlink == 0) {
close(fd); // lockfile deleted, create a new one
continue;
}
do_something();
unlink(lockfile); // nlink :=0 before releasing the lock
flock(fd, LOCK_UN);
close(fd); // release the ino if no other proc
return true;
}
return false;
Sorry if I reply to a dead question:
After locking the file, open another copy of it, fstat both copies and check the inode number, like this:
lockfile = "/tmp/some_name.lock";
while(1) {
fd = open(lockfile, O_CREAT);
flock(fd, LOCK_EX);
fstat(fd, &st0);
stat(lockfile, &st1);
if(st0.st_ino == st1.st_ino) break;
close(fd);
}
do_something();
unlink(lockfile);
flock(fd, LOCK_UN);
This prevents the race condition, because if a program holds a lock on a file that is still on the file system, every other program that has a leftover file will have a wrong inode number.
I actually proved it in the state-machine model, using the following properties:
If P_i has a descriptor locked on the filesystem then no other process is in the critical section.
If P_i is after the stat with the right inode or in the critical section it has the descriptor locked on the filesystem.