IO redirection and the head command
When the shell gets a command line like: command > file.out
the shell itself opens (and creates, if necessary) the file named file.out. The shell then makes file descriptor 1 (standard output) refer to the file descriptor it got from the open. That's how I/O redirection works: every process knows about file descriptors 0, 1 and 2 (standard input, output and error).
The hard part about this is how to open file.out. Most of the time, you want file.out opened for write at offset 0 (i.e. truncated), and this is what the shell did for you. It truncated .hgignore, opened it for write, dup'ed the file descriptor to 1, then exec'ed head, which by then had only an empty file left to read. Instant file clobbering.
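Here's a minimal sketch of the clobbering, run in a scratch directory so nothing real gets hurt (demo.txt and the variable names are made up for illustration):

```shell
# Work in a throwaway directory so no real file is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

printf 'one\ntwo\nthree\n' > demo.txt

# The shell truncates demo.txt *before* it starts head,
# so head reads an empty file and writes nothing.
head -n 1 demo.txt > demo.txt

if [ -s demo.txt ]; then
    state="still has data"
else
    state="empty"
fi
echo "demo.txt is now $state"
```

The file ends up empty rather than holding its first line.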
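In bash, you can run set -o noclobber to change this behavior: the shell then refuses to overwrite an existing file with >, unless you explicitly ask for it with >|.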
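A small sketch of what noclobber does, assuming a POSIX-style shell such as bash (the notes.txt file and the variable names are just for illustration):

```shell
scratch=$(mktemp -d)
target="$scratch/notes.txt"
printf 'original\n' > "$target"

set -o noclobber

# With noclobber on, a plain > refuses to overwrite an existing file.
# The subshell keeps the failed redirection contained.
if ! ( printf 'overwritten\n' > "$target" ) 2>/dev/null; then
    echo "plain > was refused"
fi
after_refused=$(cat "$target")

# >| overrides noclobber for this one redirection.
printf 'forced\n' >| "$target"
after_forced=$(cat "$target")

set +o noclobber
echo "$after_refused / $after_forced"
```

Note that noclobber only protects against > onto an existing file; it would have turned the accidental truncation into an error instead of silently emptying the file.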
I think Bruce answers what's going on here with the shell pipeline.
One of my favorite little utilities is the sponge command from moreutils. It solves exactly this problem by "soaking" up all available input before it opens the target output file and writes the data. It allows you to write pipelines exactly how you expected to:
$ head -1 .hgignore | sponge .hgignore
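If moreutils isn't installed, the same idea can be roughly approximated with a shell function. This is only a sketch of the "read everything first, open the target last" behavior, not how sponge is actually implemented; the name soak and the demo file are made up:

```shell
# soak FILE: hypothetical stand-in for sponge.
soak() {
    soak_tmp=$(mktemp) || return 1
    # Read stdin to EOF first; only then open (and truncate) the target.
    cat > "$soak_tmp" && cat "$soak_tmp" > "$1"
    soak_status=$?
    rm -f "$soak_tmp"
    return "$soak_status"
}

scratch=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$scratch/demo.txt"

# By the time soak sees EOF, head has finished reading demo.txt.
head -n 1 "$scratch/demo.txt" | soak "$scratch/demo.txt"

result=$(cat "$scratch/demo.txt")
echo "$result"
```

Because the target is written with a redirection rather than replaced by a rename, the file keeps its inode and permissions.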
The poor man's solution is to send the output to a temporary file and then, once the pipeline is done (for example, in the next command you run), move the temp file back to the original file location.
$ head -1 .hgignore > .hgignore.tmp
$ mv .hgignore{.tmp,}
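A slightly more defensive take on the temp-file approach, as a sketch: it assumes mktemp is available and only replaces the original if head succeeded. The demo file and variable names are illustrative.

```shell
scratch=$(mktemp -d)
orig="$scratch/demo.txt"
printf 'one\ntwo\nthree\n' > "$orig"

# Create the temp file next to the original so mv is a same-filesystem rename.
tmp=$(mktemp "$orig.XXXXXX") || exit 1

# Replace the original only if head succeeded; otherwise discard the temp file.
if head -n 1 "$orig" > "$tmp"; then
    mv -- "$tmp" "$orig"
else
    rm -f -- "$tmp"
fi

first=$(cat "$orig")
echo "$first"
```

One caveat: mv swaps in a new file, so the result carries the temp file's permissions (mktemp creates it readable and writable by the owner only) rather than the original's.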
In
head -n 1 file > file
file is truncated before head is started, but if you write it:
head -n 1 file 1<> file
it's not, as file is opened in read-write mode. However, when head finishes writing, it doesn't truncate the file, so the line above would be a no-op (head would just rewrite the first line over itself and leave the other ones untouched).
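A quick way to convince yourself of the no-op, sketched in a scratch directory (the before/after variables are only there for the comparison):

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf 'one\ntwo\nthree\n' > file

before=$(cat file)

# 1<> opens file read-write on fd 1 without truncating it,
# so head overwrites "one" with "one" and leaves the rest alone.
head -n 1 file 1<> file

after=$(cat file)
if [ "$before" = "$after" ]; then
    echo "file is unchanged"
fi
```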
However, after head has returned and while the fd is still open, you can call another command that does the truncate.
For instance:
{ head -n 1 file; perl -e 'truncate STDOUT, tell STDOUT'; } 1<> file
What matters here is that, before the truncate above runs, head just moves the cursor for fd 1 to just after the first line, and that is where the file then gets cut. head does rewrite the first line, which we didn't need it to, but that's not harmful.
With a POSIX head, we could actually get away without rewriting that first line:
{ head -n 1 > /dev/null
perl -e 'truncate STDIN, tell STDIN'
} <> file
Here, we're using the fact that head moves the cursor position in its stdin. While head would typically read its input in big chunks to improve performance, POSIX would require it (where possible) to seek back to just after the first line if it had gone beyond it. Note however that not all implementations do it.
Alternatively, you can use the shell's read command instead in this case:
{ read -r dummy; perl -e 'truncate STDIN, tell STDIN'; } <> file