The process substitution output is out of order
Yes, in bash, like in ksh (where the feature comes from), the processes inside the process substitution are not waited for (before running the next command in the script).

For a <(...) one, that's usually fine, as in:

cmd1 <(cmd2)

the shell will be waiting for cmd1, and cmd1 will typically be waiting for cmd2 by virtue of it reading until end-of-file on the pipe that is substituted, and that end-of-file typically happens when cmd2 dies. That's the same reason several shells (not bash) don't bother waiting for cmd2 in cmd2 | cmd1.
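As a minimal sketch of that behaviour (wc and seq are only stand-ins for cmd1 and cmd2):

wc -l <(seq 1000)    # the shell waits for wc, and wc reads the substituted pipe until seq exits

wc only returns once it hits end-of-file on that pipe, which happens when seq terminates, so nothing runs out of order here.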
For cmd1 >(cmd2), however, that's generally not the case, as there it's more cmd2 that typically waits for cmd1, so it will generally exit after it.
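You can see the problem with a sketch like this (sleep only widens the window; echo and cat stand in for cmd1 and cmd2):

echo started > >(sleep 1; cat); echo 'script continues'

Here 'script continues' is printed immediately, while the output of the substituted cat only shows up about a second later.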
That's fixed in zsh, which waits for cmd2 there (but not if you write it as cmd1 > >(cmd2) and cmd1 is not a builtin; use {cmd1} > >(cmd2) instead, as documented).
ksh doesn't wait by default, but lets you wait for it with the wait builtin (it also makes the pid available in $!, though that doesn't help if you do cmd1 >(cmd2) >(cmd3)).
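In ksh, a sketch of that would be (cmd1 and cmd2 as above):

cmd1 >(cmd2)
wait "$!"    # $! holds the pid of the process substitution, so this waits for cmd2

As noted, that only covers the last process substitution on the command line.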
rc (with the cmd1 >{cmd2} syntax): same as ksh, except that you can get the pids of all the background processes with $apids.
es (also with cmd1 >{cmd2}) waits for cmd2 like in zsh, and also waits for cmd2 in <{cmd2} process redirections.
bash does make the pid of cmd2 (or more exactly of the subshell, as it does run cmd2 in a child process of that subshell even though it's the last command there) available in $!, but doesn't let you wait for it.
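For example (true and sleep are just stand-ins):

true >(sleep 1)
echo "$!"    # pid of the subshell spawned for the process substitution

but, as said, in the bash versions described here you can't feed that pid to the wait builtin.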
If you do have to use bash, you can work around the problem by using a command that will wait for both commands with:
{ { cmd1 >(cmd2); } 3>&1 >&4 4>&- | cat; } 4>&1
That makes both cmd1 and cmd2 have their fd 3 open to a pipe. cat will wait for end-of-file at the other end, so it will typically only exit when both cmd1 and cmd2 are dead. And the shell will wait for that cat command. You could see that as a net to catch the termination of all background processes (you can use it for other things started in the background, like with &, coprocs, or even commands that background themselves, provided they don't close all their file descriptors like daemons typically do).
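For instance, with sleep just making the delay visible (a sketch of the same pattern, with echo and sleep; cat standing in for cmd1 and cmd2):

{ { echo go > >(sleep 1; cat); } 3>&1 >&4 4>&- | cat; } 4>&1
echo 'runs only after the substituted sleep+cat has finished'

Without the surrounding net, the last echo would run straight away; with it, the outer cat keeps the shell waiting until both the inner echo and the process substitution are gone.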
Note that thanks to that wasted subshell process mentioned above, it works even if cmd2 closes its fd 3 (commands usually don't do that, but some, like sudo or ssh, do). Future versions of bash may eventually do the optimisation there like in other shells. Then you'd need something like:
{ { cmd1 >(sudo cmd2; exit); } 3>&1 >&4 4>&- | cat; } 4>&1
to make sure there's still an extra shell process with that fd 3 open, waiting for that sudo command.
Note that cat won't read anything (since the processes don't write on their fd 3); it's just there for synchronisation. It will do just one read() system call that will return with nothing at the end.
You can actually avoid running cat by using a command substitution to do the pipe synchronisation:
{ unused=$( { cmd1 >(cmd2); } 3>&1 >&4 4>&-); } 4>&1
This time, it's the shell instead of cat that is reading from the pipe whose other end is open on fd 3 of cmd1 and cmd2. We're using a variable assignment so that the exit status of cmd1 is available in $?.
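A sketch showing both the synchronisation and the $? propagation (false and sleep; cat are only stand-ins for cmd1 and cmd2):

{ unused=$( { false > >(sleep 1; cat); } 3>&1 >&4 4>&-); } 4>&1
echo "$?"    # prints 1, the exit status of false, and only once sleep and cat have exited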
Or you could do the process substitution by hand, and then you could even use your system's sh, as that would become standard shell syntax:
{ cmd1 /dev/fd/3 3>&1 >&4 4>&- | cmd2 4>&-; } 4>&1
though note, as mentioned earlier, that not all sh implementations will wait for cmd1 after cmd2 has finished (though that's better than the other way round). That time, $? contains the exit status of cmd2; though bash and zsh make cmd1's exit status available in ${PIPESTATUS[0]} and $pipestatus[1] respectively (see also the pipefail option in a few shells so that $? can report the failure of pipe components other than the last).
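For instance, to capture both statuses after the hand-rolled version in bash (a sketch; in zsh you'd read $pipestatus[1] instead):

{ cmd1 /dev/fd/3 3>&1 >&4 4>&- | cmd2 4>&-; } 4>&1
echo "cmd2 exited with $?, cmd1 with ${PIPESTATUS[0]}"

Both $? and ${PIPESTATUS[0]} are expanded in the same command here, before anything else can overwrite them.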
Note that yash has similar issues with its process redirection feature. cmd1 >(cmd2) would be written cmd1 /dev/fd/3 3>(cmd2) there, but cmd2 is not waited for, you can't use wait to wait for it, and its pid is not made available in the $! variable either. You'd use the same workarounds as for bash.
You can pipe the second command into another cat, which will wait until its input pipe closes. Ex:
prompt$ echo one; echo two > >(cat) | cat; echo three;
one
two
three
prompt$
Short and simple.
==========
As simple as it seems, a lot is going on behind the scenes. You can ignore the rest of the answer if you aren't interested in how this works.
When you have echo two > >(cat); echo three, >(cat) is forked off by the interactive shell and runs independently of echo two. Thus, echo two finishes, and then echo three gets executed, but before >(cat) finishes. When bash gets data from >(cat) when it didn't expect it (a couple of milliseconds later), it gives you that prompt-like situation where you have to hit newline to get back to the terminal (same as if another user mesg'ed you).
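Roughly what that looks like interactively, with a sleep added so the race is reliable (the exact interleaving with the prompt can vary):

prompt$ echo two > >(sleep 1; cat); echo three
three
prompt$ two

and you then have to press Enter to get a clean prompt back.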
However, given echo two > >(cat) | cat; echo three, two subshells are spawned (as per the documentation of the | symbol). One subshell, named A, is for echo two > >(cat), and one subshell, named B, is for cat. A is automatically connected to B (A's stdout is B's stdin). Then, echo two and >(cat) begin executing. >(cat)'s stdout is set to A's stdout, which is equal to B's stdin.

After echo two finishes, A exits, closing its stdout. However, >(cat) is still holding a reference to B's stdin. The second cat's stdin is B's stdin, and that cat will not exit until it sees an EOF. An EOF is only given when no one has the file open in write mode anymore, so >(cat)'s stdout is blocking the second cat. B remains waiting on that second cat. Since echo two exited, >(cat) eventually gets an EOF, so >(cat) flushes its buffer and exits. No one is holding B's (the second cat's) stdin anymore, so the second cat reads an EOF (B isn't reading its stdin at all; it doesn't care). This EOF causes the second cat to flush its buffer, close its stdout, and exit, and then B exits, because cat exited and B was waiting on cat.
A caveat of this is that bash also spawns a subshell for >(cat)! Because of this, you'll see that
echo two > >(sleep 5) | cat; echo three
will still wait 5 seconds before executing echo three, even though sleep 5 isn't holding B's stdin. This is because a hidden subshell C, spawned for >(sleep 5), is waiting on sleep, and C is holding B's stdin. You can see how
echo two > >(exec sleep 5) | cat; echo three
will not wait, however, since sleep isn't holding B's stdin and there's no ghost subshell C holding B's stdin (exec forces sleep to replace C, as opposed to forking and making C wait on sleep). Regardless of this caveat,
echo two > >(exec cat) | cat; echo three
will still properly execute the commands in order, as described previously.