How to make reading and writing the same file in the same pipeline always "fail"?
Why there is a race condition
The two sides of a pipe are executed in parallel, not one after the other. There's a very simple way to demonstrate this: run
time sleep 1 | sleep 1
This takes one second, not two.
The shell starts two child processes and waits for both of them to complete. These two processes execute in parallel: the only reason why one of them would synchronize with the other is when it needs to wait for the other. The most common point of synchronization is when the right-hand side blocks waiting for data to read on its standard input, and becomes unblocked when the left-hand side writes more data. The converse can also happen, when the right-hand side is slow to read data and the left-hand side blocks in its write operation until the right-hand side reads more data (there is a buffer in the pipe itself, managed by the kernel, but it has a small maximum size).
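The second case (a writer blocked on a full pipe buffer) can be seen by writing more data than the buffer holds while the reader sleeps. This is only a sketch: it assumes a head that supports -c and a pipe buffer well under 1 MiB (64 KiB is the default on Linux), and the message text is made up for illustration.

```shell
#!/bin/sh
# The writer sends 1 MiB; the reader waits 2 seconds before reading anything.
# The writer cannot finish until the reader starts draining the pipe.
start=$(date +%s)
{
  head -c 1048576 /dev/zero
  # Reached only once all the data has been accepted by the pipe.
  echo "writer finished after $(( $(date +%s) - start ))s" >&2
} | {
  sleep 2
  wc -c
}
```

With a payload of just a few bytes instead, the writer should report finishing almost immediately, because the data fits in the kernel buffer.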
To observe a point of synchronization, run the following commands (sh -x prints each command as it executes it):
time sh -x -c '{ sleep 1; echo a; } | { cat; }'
time sh -x -c '{ echo a; sleep 1; } | { cat; }'
time sh -x -c '{ echo a; sleep 1; } | { sleep 1; cat; }'
time sh -x -c '{ sleep 2; echo a; } | { cat; sleep 1; }'
Play with variations until you're comfortable with what you observe.
Given the compound command
cat tmp | head -1 > tmp
the left-hand process does the following (I've only listed steps that are relevant to my explanation):
1. Execute the external program cat with the argument tmp.
2. Open tmp for reading.
3. While it hasn't reached the end of the file, read a chunk from the file and write it to standard output.
The right-hand process does the following:
1. Redirect standard output to tmp, truncating the file in the process.
2. Execute the external program head with the argument -1.
3. Read one line from standard input and write it to standard output.
The only point of synchronization is that right-3 waits for left-3 to have processed one full line. There is no synchronization between left-2 and right-1, so they can happen in either order. What order they happen in is not predictable: it depends on the CPU architecture, on the shell, on the kernel, on which cores the processes happen to be scheduled, on what interrupts the CPU receives around that time, etc.
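One way to watch this unpredictability is to run the pipeline many times and count the outcomes. The following is a minimal sketch; the run count and variable names are arbitrary, and the split between the two outcomes will differ from machine to machine and from run to run.

```shell
#!/bin/sh
# Run the racy pipeline repeatedly and count how often the data survives.
cd "$(mktemp -d)" || exit 1   # work in a scratch directory
runs=200
kept=0
emptied=0
i=0
while [ "$i" -lt "$runs" ]; do
  printf 'first\nsecond\n' > tmp
  cat tmp | head -1 > tmp
  if [ -s tmp ]; then
    kept=$((kept + 1))        # cat read the data before the truncation
  else
    emptied=$((emptied + 1))  # the truncation happened before cat read anything
  fi
  i=$((i + 1))
done
echo "kept: $kept, emptied: $emptied"
```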
How to change the behavior
You cannot change the behavior by changing a system setting. The computer does what you tell it to do. You told it to truncate tmp and read from tmp in parallel, so it does the two things in parallel.
Ok, there is one “system setting” you could change: you could replace /bin/bash by a different program that is not bash. I hope it would go without saying that this is not a good idea.
If you want the truncation to happen before the left-hand side of the pipe, you need to put it outside of the pipeline, for example:
{ cat tmp | head -1; } >tmp
or
( exec >tmp; cat tmp | head -1 )
I have no idea why you'd want this though. What's the point in reading from a file that you know to be empty?
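For completeness, here is a small check that hoisting the redirection out of the pipeline makes the outcome deterministic. It is a sketch that works in a scratch directory created with mktemp -d; the sample file contents are made up.

```shell
#!/bin/sh
cd "$(mktemp -d)" || exit 1
printf 'first\nsecond\n' > tmp
# The shell opens (and truncates) tmp before it starts anything inside the braces,
# so cat can only ever see an empty file.
{ cat tmp | head -1; } > tmp
if [ -s tmp ]; then
  echo "tmp still has data"
else
  echo "tmp is empty"
fi
```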
Conversely, if you want the output redirection (including the truncation) to happen after cat has finished reading, then you need to either fully buffer the data in memory, e.g.
line=$(cat tmp | head -1)
printf %s "$line" >tmp
or write to a different file and then move it into place. This is usually the robust way to do things in scripts, and has the advantage that the file is written in full before it's visible through the original name.
cat tmp | head -1 >new && mv new tmp
The moreutils collection includes a program that does just that, called sponge.
cat tmp | head -1 | sponge tmp
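The write-then-rename approach can be made a little more careful in a script. The sketch below assumes a mktemp that accepts a template argument (both the GNU and BSD versions do); it reads tmp with a redirection instead of cat, and the variable name new is arbitrary.

```shell
#!/bin/sh
cd "$(mktemp -d)" || exit 1
printf 'first\nsecond\n' > tmp
# Create the temporary file next to the target so that mv is a plain rename
# within one filesystem.
new=$(mktemp ./tmp.XXXXXX) || exit 1
if head -1 < tmp > "$new"; then
  mv -- "$new" tmp      # replace the original only if head succeeded
else
  rm -f -- "$new"       # leave the original untouched on failure
fi
cat tmp
```

One caveat: mktemp creates the file with restrictive permissions, so the replaced tmp may end up with a different mode than the original had.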
How to detect the issue automatically
If your goal was to take badly-written scripts and automatically figure out where they break, then sorry, life isn't that simple. Runtime analysis won't reliably find the problem because sometimes cat finishes reading before the truncation happens. Static analysis can in principle do it; the simplified example in your question is caught by Shellcheck, but it may not catch a similar problem in a more complex script.
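To try the static check yourself, you can put the one-liner in a script file and run Shellcheck on it. This is a sketch that assumes the shellcheck command may or may not be installed; as far as I know the relevant warning is SC2094 (reading and writing the same file in the same pipeline), but the exact output depends on the Shellcheck version.

```shell
#!/bin/sh
# Write the example to a script file and lint it if shellcheck is available.
script=$(mktemp) || exit 1
printf '%s\n' '#!/bin/sh' 'cat tmp | head -1 > tmp' > "$script"
if command -v shellcheck >/dev/null 2>&1; then
  shellcheck "$script" || true   # a non-zero exit only means warnings were reported
else
  echo "shellcheck not found; skipping" >&2
fi
```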
Gilles' answer explains the race condition. I'm just going to answer this part:
Is there any way I can force this script to output always 0 lines (so the I/O redirection to tmp is always prepared first and so the data is always destroyed)? To be clear, I mean changing the system settings
IDK if a tool for this already exists, but I have an idea for how one could be implemented. (But note this wouldn't be always 0 lines, just a useful tester that catches simple races like this easily, and some more complicated races. See @Gilles' comment.) It wouldn't guarantee that a script was safe, but might be a useful tool in testing, similar to testing a multi-threaded program on different CPUs, including weakly-ordered non-x86 CPUs like ARM.
You'd run it as racechecker bash foo.sh
Use the same system-call tracing / intercepting facilities that strace -f and ltrace -f use to attach to every child process. (On Linux, this is the same ptrace system call used by GDB and other debuggers to set breakpoints, single step, and modify memory / registers of another process.)
Instrument the open and openat system calls: when any process running under this tool makes an open(2) system call (or openat) with O_RDONLY, sleep for maybe 1/2 or 1 second. Let other open system calls (especially ones including O_TRUNC) execute without delay.
This should allow the writer to win the race in nearly every race condition, unless system load was also high, or it was a complicated race condition where the truncation didn't happen until after some other read. So random variation of which open()s (and maybe read()s or writes) are delayed would increase the detection power of this tool. But of course, without testing for an infinite amount of time with a delay simulator that will eventually cover all possible situations you can encounter in the real world, you can't be sure your scripts are free from races unless you read them carefully and prove that they are.
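The effect such a tool is aiming for can be imitated by hand for the example from the question. This is not the tracer itself, just a sketch in which a sleep stands in for the delay that would be injected before cat's read-only open.

```shell
#!/bin/sh
cd "$(mktemp -d)" || exit 1
printf 'first\nsecond\n' > tmp
# The sleep plays the role of the injected delay: by the time cat opens tmp,
# the right-hand side has long since truncated it.
{ sleep 1; cat tmp; } | head -1 > tmp
if [ -s tmp ]; then
  echo "data survived"
else
  echo "data lost: the truncation won the race"
fi
```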
You would probably need it to whitelist (i.e. not delay open for) files in /usr/bin and /usr/lib so process-startup doesn't take forever. (Runtime dynamic linking has to open() multiple files; look at strace -e trace=open,openat /bin/true or /bin/ls sometime. If the parent shell itself is doing the truncation, that will be ok, but it will still be good for this tool to not make scripts unreasonably slow.)
Or maybe whitelist every file the calling process doesn't have permission to truncate in the first place, i.e. the tracing process can make an access(2) system call before actually suspending the process that wanted to open() a file.
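The whitelisting decision could look roughly like this. It is a shell sketch of the policy only, not of the tracer: should_delay is a hypothetical helper name, the path list is just the two directories mentioned above, and test -w stands in for the access(2) check.

```shell
#!/bin/sh
# should_delay PATH: succeed if a read-only open of PATH is worth delaying.
should_delay() {
  case $1 in
    /usr/bin/*|/usr/lib/*) return 1 ;;   # system files: never delay these
  esac
  [ -w "$1" ]   # only files this user could truncate are interesting
}

for f in /usr/bin/env "$HOME/.profile" /etc/passwd; do
  if should_delay "$f"; then
    echo "delay: $f"
  else
    echo "no delay: $f"
  fi
done
```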
racechecker itself would have to be written in C, not in shell, but could maybe use strace's code as a starting point and might not take much work to implement.
You could maybe get the same functionality with a FUSE filesystem. There's probably a FUSE example of a pure passthrough filesystem, so you could add checks to the open() function in that which make it sleep for read-only opens but let truncation happen right away.