Do file descriptors optimise writing to files?
The main difference between opening the file before the loop with exec and putting the redirection on the command inside the loop is that the former sets up the file descriptor just once, while the latter opens and closes the file on each iteration of the loop.
Doing it once is likely to be more efficient, but if you were to run an external command inside the loop, the difference would probably disappear in the cost of launching the command. (echo here is probably a builtin, so that doesn't apply.)
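A minimal sketch of the two approaches, assuming bash and that fd 3 is free; the scratch directory and file names are made up for illustration:

```bash
#!/usr/bin/env bash
# Sketch only: out_dir and the file names below are hypothetical.
out_dir=$(mktemp -d)

# Variant 1: redirection on the command, so the file is opened
# and closed on every iteration.
for i in {1..1000}; do
    echo "$i" >>"$out_dir/per_iteration"
done

# Variant 2: redirection set up once with exec before the loop.
# fd 3 keeps a copy of the original stdout (assumes fd 3 is free).
exec 3>&1 1>>"$out_dir/opened_once"
for i in {1..1000}; do
    echo "$i"
done
exec 1>&3 3>&-    # restore stdout and close fd 3

# Both variants should have written identical files.
cmp "$out_dir/per_iteration" "$out_dir/opened_once" && echo "same output"
```

The contents are the same either way; only the number of times the file is opened differs.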
If the output is going to be sent to something other than a regular file (e.g. if x is a named pipe), the act of opening and closing the file may be visible to other processes, so there may be differences in behaviour, too.
Note that there's really no difference between a redirection through exec and a redirection on the command: they both open the file and juggle file descriptor numbers.
These two should be pretty much equivalent, in that they both open() the file and write() to it (though there are differences in how fd 1 is saved for the duration of the command):
for i in {1..1000}; do
    >>x echo "$i"
done

for i in {1..1000}; do
    exec 3>&1 1>>x    # assuming fd 3 is available
    echo "$i"         # here, fd 3 is visible to the command
    exec 1>&3 3>&-
done
Yes, it is more efficient
The easiest way to test is to increase the count to, say, 500000 and time it:
> time bash s1.sh; time bash s2.sh
bash s1.sh 16,47s user 10,00s system 99% cpu 26,537 total
bash s2.sh 10,51s user 3,50s system 99% cpu 14,008 total
strace(1) reveals why: we have one simple write instead of open + 5*fcntl + 2*dup2 + 2*close + write per iteration.
For the per-command redirection
for i in {1..1000}; do >>x echo "$i"; done
we get:
open("x", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
fcntl(1, F_GETFD) = 0
fcntl(1, F_DUPFD, 10) = 10
fcntl(1, F_GETFD) = 0
fcntl(10, F_SETFD, FD_CLOEXEC) = 0
dup2(3, 1) = 1
close(3) = 0
write(1, "997\n", 4) = 4
dup2(10, 1) = 1
fcntl(10, F_GETFD) = 0x1 (flags FD_CLOEXEC)
close(10) = 0
open("x", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
fcntl(1, F_GETFD) = 0
fcntl(1, F_DUPFD, 10) = 10
fcntl(1, F_GETFD) = 0
fcntl(10, F_SETFD, FD_CLOEXEC) = 0
dup2(3, 1) = 1
close(3) = 0
write(1, "998\n", 4) = 4
dup2(10, 1) = 1
fcntl(10, F_GETFD) = 0x1 (flags FD_CLOEXEC)
close(10) = 0
open("x", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
fcntl(1, F_GETFD) = 0
fcntl(1, F_DUPFD, 10) = 10
fcntl(1, F_GETFD) = 0
fcntl(10, F_SETFD, FD_CLOEXEC) = 0
dup2(3, 1) = 1
close(3) = 0
write(1, "999\n", 4) = 4
dup2(10, 1) = 1
fcntl(10, F_GETFD) = 0x1 (flags FD_CLOEXEC)
close(10) = 0
open("x", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
fcntl(1, F_GETFD) = 0
fcntl(1, F_DUPFD, 10) = 10
fcntl(1, F_GETFD) = 0
fcntl(10, F_SETFD, FD_CLOEXEC) = 0
dup2(3, 1) = 1
close(3) = 0
write(1, "1000\n", 5) = 5
dup2(10, 1) = 1
fcntl(10, F_GETFD) = 0x1 (flags FD_CLOEXEC)
close(10) = 0
while for the version that redirects once with exec 3>&1 1>x, we get a much cleaner trace:
write(1, "995\n", 4) = 4
write(1, "996\n", 4) = 4
write(1, "997\n", 4) = 4
write(1, "998\n", 4) = 4
write(1, "999\n", 4) = 4
write(1, "1000\n", 5) = 5
But note that the difference is not due to "using an FD", but to where you do the redirection. For example, if you were to do
for i in {1..1000}; do echo "$i"; done > x
you would get pretty much the same performance as your second example:
bash s3.sh 10,35s user 3,70s system 100% cpu 14,042 total
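The scripts s1.sh, s2.sh and s3.sh are not shown, so here is a rough sketch of how such a comparison could be set up in a single bash script. The function names, the scratch directory, and the default count N are assumptions for illustration, and fd 3 is assumed to be free. A C-style loop is used because brace expansion such as {1..$N} does not expand variables.

```bash
#!/usr/bin/env bash
# Sketch only: N, work, and the function names are hypothetical.
N=${N:-20000}
work=$(mktemp -d)

# Redirection on the command: one open/close per iteration.
per_command() {
    for ((i = 1; i <= N; i++)); do echo "$i" >>"$work/a"; done
}

# Redirection set up once with exec, then undone afterwards.
exec_once() {
    exec 3>&1 1>>"$work/b"    # assumes fd 3 is free
    for ((i = 1; i <= N; i++)); do echo "$i"; done
    exec 1>&3 3>&-
}

# Redirection on the whole loop: also a single open.
whole_loop() {
    for ((i = 1; i <= N; i++)); do echo "$i"; done >"$work/c"
}

for variant in per_command exec_once whole_loop; do
    echo "== $variant" >&2
    time "$variant"           # bash's time keyword reports on stderr
done
```

Running it with a larger count, for example N=500000 bash sketch.sh (a hypothetical file name), should make the gap between the first variant and the other two easier to see. Absolute numbers will vary by machine.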