How do you use the command coproc in various shells?
Co-processes are a ksh feature (already in ksh88). zsh has had the feature from the start (early 90s), while it was only added to bash in 4.0 (2009).

However, the behaviour and interface differ significantly between the 3 shells.

The idea is the same, though: it lets you start a job in the background while being able to send it input and read its output, without having to resort to named pipes. That is done with unnamed pipes in most shells, and with socketpairs in recent versions of ksh93 on some systems.
In a | cmd | b, a feeds data to cmd and b reads its output. Running cmd as a co-process allows the shell to be both a and b.
ksh co-processes
In ksh, you start a co-process as:
cmd |&
You feed data to cmd by doing things like:
echo test >&p
or
print -p test
And read cmd's output with things like:
read var <&p
or
read -p var
cmd is started like any background job. You can use fg, bg, and kill on it, and refer to it by %job-number or via $!.
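For instance, here's a complete round trip. This is a minimal sketch: the awk fflush() call sidesteps the pipe-buffering problem discussed later in this answer.

awk '{print toupper($0); fflush()}' |&
print -p hello          # write to awk's stdin
read -p answer          # read one line of awk's output
print -r -- "$answer"   # prints HELLO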
To close the writing end of the pipe cmd is reading from, you can do:
exec 3>&p 3>&-
And to close the reading end of the other pipe (the one cmd is writing to):
exec 3<&p 3<&-
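Closing the writing end is how the co-process gets told it has seen all its input. For instance (a sketch; tr only flushes its output at EOF, as discussed further down):

tr a b |&
print -p aaa
exec 3>&p 3>&-   # move the write end to fd 3, then close it: tr sees EOF
read -p line     # tr has flushed and exited, so this gets "bbb"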
You cannot start a second co-process unless you first save the pipe file descriptors to some other fds. For instance:
tr a b |&
exec 3>&p 4<&p   # save the first co-process's pipe fds on 3 and 4
tr b c |&
echo aaa >&3     # goes to the first co-process (tr a b)
echo bbb >&p     # goes to the second co-process (tr b c)
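A sketch of how that example could then be wound down, closing the write ends and collecting both outputs (again relying on the tr commands flushing at EOF):

exec 3>&-        # EOF to the first co-process (tr a b)
read -u4 line1   # line1 is now "bbb"
exec 5>&p 5>&-   # move and close the write end of the second co-process
read -p line2    # line2 is now "ccc"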
zsh co-processes
In zsh, co-processes are nearly identical to those in ksh. The only real difference is that zsh co-processes are started with the coproc keyword.
coproc cmd
echo test >&p
read var <&p
print -p test
read -p var
Doing:

exec 3>&p

doesn't move the coproc file descriptor to fd 3 (as it would in ksh), but duplicates it. So there's no explicit way to close the feeding or reading pipe, other than starting another coproc.
For instance, to close the feeding end:
coproc tr a b
echo aaaa >&p # send some data
exec 4<&p # preserve the reading end on fd 4
coproc : # start a new short-lived coproc (runs the null command)
cat <&4 # read the output of the first coproc
In addition to pipe-based co-processes, zsh (since 3.1.6-dev19, released in 2000) has pseudo-tty based constructs like expect. To interact with most programs, ksh-style co-processes won't work, since programs start buffering when their output is a pipe.
Here are some examples.
Start the co-process x:
zmodload zsh/zpty
zpty x cmd
(Here, cmd is a simple command. But you can do fancier things with eval or functions.)
Feed a co-process data:
zpty -w x some data
Read co-process data (in the simplest case):
zpty -r x var
Like expect, it can wait for some output from the co-process matching a given pattern.
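For instance, here's a minimal sketch driving GNU bc (the -q flag, a GNU extension, suppresses the start-up banner; note that the pseudo-terminal echoes input back, so the variable also captures the echoed line):

zmodload zsh/zpty
zpty calc bc -q
zpty -w calc '1 + 2'
zpty -r calc answer '*3*'   # read until the output matches the pattern
print -r -- $answer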
bash co-processes
The bash syntax is a lot newer, and builds on top of a feature recently added to ksh93, bash, and zsh: a syntax ({var}>file and friends) for handling dynamically allocated file descriptors above 10.
bash offers a basic coproc syntax, and an extended one.
Basic syntax
The basic syntax for starting a co-process looks like zsh's:
coproc cmd
In ksh or zsh, the pipes to and from the co-process are accessed with >&p and <&p. But in bash, the file descriptors of the pipe from the co-process and of the pipe to the co-process are returned in the $COPROC array (${COPROC[0]} and ${COPROC[1]} respectively). So…
Feed data to the co-process:
echo xxx >&"${COPROC[1]}"
Read data from the co-process:
read var <&"${COPROC[0]}"
With the basic syntax, you can start only one co-process at a time.
Extended syntax
In the extended syntax, you can name your co-processes (like in zsh zpty co-processes):
coproc mycoproc { cmd; }
The command has to be a compound command. (Notice how the example above is reminiscent of function f { ...; }.)
This time, the file descriptors are in ${mycoproc[0]} and ${mycoproc[1]}.
You can start more than one co-process at a time—but you do get a warning when you start a co-process while one is still running (even in non-interactive mode).
You can close the file descriptors when using the extended syntax.
coproc tr { tr a b; }
echo aaa >&"${tr[1]}"   # write to the co-process
exec {tr[1]}>&-         # close its stdin so tr sees EOF and flushes
cat <&"${tr[0]}"        # read everything it wrote
Note that closing that way doesn't work in bash versions prior to 4.3, where you have to write it instead as:
fd=${tr[1]}
exec {fd}>&-
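On those older versions, the full close-and-read sequence from the example above would read (a sketch):

coproc tr { tr a b; }
echo aaa >&"${tr[1]}"
fd=${tr[1]}        # copy the fd number into a plain variable...
exec {fd}>&-       # ...which pre-4.3 bash can close
cat <&"${tr[0]}"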
As in ksh and zsh, those pipe file descriptors are marked close-on-exec. But in bash, the only way to pass them to executed commands is to duplicate them to fds 0, 1, or 2. That limits the number of co-processes you can interact with for a single command. (See below for an example.)
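For example, here's a sketch (assuming bash 4.3 or newer for the fd-closing syntax). Since the fds in $COPROC are close-on-exec, we keep our own copy of the read end and hand it to cat as its stdin:

coproc tr a b
echo aaa >&"${COPROC[1]}"
exec {out}<&"${COPROC[0]}"   # our own copy: bash clears $COPROC when the coproc dies
exec {COPROC[1]}>&-          # close the write end: tr sees EOF, flushes, and exits
cat <&$out                   # cat reads through fd 0, duplicated from our copy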
yash process and pipeline redirection
yash doesn't have a co-process feature per se, but the same concept can be implemented with its pipeline and process redirection features. yash has an interface to the pipe() system call, so this kind of thing can be done relatively easily by hand there.
You'd start a co-process with:
exec 5>>|4 3>(cmd >&5 4<&- 5>&-) 5>&-
That first creates a pipe(4,5) (5 the writing end, 4 the reading end), then redirects fd 3 to a pipe to a process that runs with its stdin at the other end of that pipe and its stdout going to the pipe created earlier. Then we close the writing end of the first pipe in the parent, where we won't need it. So now, in the shell, we have fd 3 connected to cmd's stdin and fd 4 connected to cmd's stdout, both via pipes.
Note that the close-on-exec flag is not set on those file descriptors.
To feed data:
echo data >&3 4<&-
To read data:
read var <&4 3>&-
And you can close fds as usual:
exec 3>&- 4<&-
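Putting it together, a minimal round trip might look like this (a sketch; tr only flushes its output at EOF, hence the close before reading):

exec 5>>|4 3>(tr a b >&5 4<&- 5>&-) 5>&-
echo aaa >&3 4<&-
exec 3>&-       # EOF on tr's stdin, so it flushes and exits
read var <&4
echo "$var"     # bbb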
Now, why they are not so popular
hardly any benefit over using named pipes
Co-processes can easily be implemented with standard named pipes. I don't know exactly when named pipes were introduced, but it's possible it was after ksh came up with co-processes (probably in the mid 80s; ksh88 was "released" in 88, but I believe ksh was used internally at AT&T a few years before that), which would explain why.
cmd |&
echo data >&p
read var <&p
Can be written with:
mkfifo in out
cmd <in >out &
exec 3> in 4< out
echo data >&3
read var <&4
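(And, when done, the explicit cleanup that coproc would have spared us:)

exec 3>&- 4<&-   # close our ends of the pipes
rm -f in out     # the named pipes have to be removed by hand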
Interacting with those is more straightforward—especially if you need to run more than one co-process. (See examples below.)
The only benefit of using coproc is that you don't have to clean up those named pipes after use.
deadlock-prone
Shells use pipes in a few constructs:

- shell pipes: cmd1 | cmd2,
- command substitution: $(cmd),
- and process substitution: <(cmd), >(cmd).

In those, the data flows in only one direction between different processes.
With co-processes and named pipes, though, it's easy to run into deadlock. You have to keep track of which command has which file descriptor open, to prevent one staying open and holding a process alive. Deadlocks can be tricky to investigate, because they may occur non-deterministically; for instance, only when enough data to fill one pipe up is sent at once.
works worse than expect for what it's been designed for
The main purpose of co-processes was to provide the shell with a way to interact with commands. However, it does not work so well.
The simplest form of deadlock mentioned above is:
tr a b |&
echo a >&p
read var <&p
Because its output doesn't go to a terminal, tr buffers its output. So it won't output anything until either it sees end-of-file on its stdin, or it has accumulated a buffer-full of data to output. So above, after the shell has output a\n (only 2 bytes), the read will block indefinitely because tr is waiting for the shell to send it more data.
In short, pipes aren't good for interacting with commands. Co-processes can only be used to interact with commands that don't buffer their output, or commands that can be told not to buffer their output; for example, by using stdbuf with some commands on recent GNU or FreeBSD systems.
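For instance, with GNU stdbuf, the tr deadlock shown above goes away, because tr's stdout is forced to be line-buffered (a sketch assuming a ksh-style co-process and GNU coreutils):

stdbuf -oL tr a b |&
echo a >&p
read var <&p   # gets "b" right away; no deadlock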
That's why expect or zpty use pseudo-terminals instead. expect is a tool designed for interacting with commands, and it does it well.
File descriptor handling is fiddly, and hard to get right
Co-processes can be used to do some more complex plumbing than what simple shell pipes allow.
That other Unix.SE answer has an example of coproc usage.
Here's a simplified example: imagine you want a function that feeds a copy of a command's output to 3 other commands, and then has the output of those 3 commands concatenated, all using pipes.
For instance: feed the output of printf '%s\n' foo bar to tr a b, sed 's/./&&/g', and cut -c2-, to obtain something like:
foo
bbr
ffoooo
bbaarr
oo
ar
First, it's not necessarily obvious, but there's a risk of deadlock there, and it will start to happen after only a few kilobytes of data.
Then, depending on your shell, you'll run into a number of different problems that have to be addressed differently.
For instance, with zsh, you'd do it with:
f() (
  coproc tr a b
  exec {o1}<&p {i1}>&p    # save tr's pipe fds before starting the next coproc
  coproc sed 's/./&&/g' {i1}>&- {o1}<&-
  exec {o2}<&p {i2}>&p    # save sed's pipe fds
  coproc cut -c2- {i1}>&- {o1}<&- {i2}>&- {o2}<&-
  tee /dev/fd/$i1 /dev/fd/$i2 >&p {o1}<&- {o2}<&- &
  exec cat /dev/fd/$o1 /dev/fd/$o2 - <&p {i1}>&- {i2}>&-
)
printf '%s\n' foo bar | f
Above, the co-process fds have the close-on-exec flag set, but not the ones that are duplicated from them (as in {o1}<&p). So, to avoid deadlocks, you have to make sure they're closed in any process that doesn't need them. Similarly, we have to use a subshell and use exec cat at the end, to ensure there's no shell process lying about holding a pipe open.
With ksh (here ksh93), that would have to be:
f() (
  tr a b |&
  exec {o1}<&p {i1}>&p    # save tr's pipe fds
  sed 's/./&&/g' |&
  exec {o2}<&p {i2}>&p    # save sed's pipe fds
  cut -c2- |&
  exec {o3}<&p {i3}>&p    # save cut's pipe fds
  eval 'tee "/dev/fd/$i1" "/dev/fd/$i2"' >&"$i3" {i1}>&"$i1" {i2}>&"$i2" &
  eval 'exec cat "/dev/fd/$o1" "/dev/fd/$o2" -' <&"$o3" {o1}<&"$o1" {o2}<&"$o2"
)
printf '%s\n' foo bar | f
(Note: That won’t work on systems where ksh
uses socketpairs
instead of pipes
, and where /dev/fd/n
works like on Linux.)
In ksh, fds above 2 are marked with the close-on-exec flag, unless they're passed explicitly on the command line. That's why we don't have to close the unused file descriptors as in zsh, but it's also why we have to do {i1}>&$i1 and use eval for that new value of $i1 to be passed to tee and cat…
In bash this cannot be done, because you can't avoid the close-on-exec flag.
Above, it's relatively simple, because we use only simple external commands. It gets more complicated when you want to use shell constructs in there instead, and you start running into shell bugs.
Compare the above with the same using named pipes:
f() {
  mkfifo p{i,o}{1,2,3}   # six named pipes: three inputs, three outputs
  tr a b < pi1 > po1 &
  sed 's/./&&/g' < pi2 > po2 &
  cut -c2- < pi3 > po3 &
  tee pi{1,2} > pi3 &    # fan the input out to the three commands
  cat po{1,2,3}          # concatenate their output
  rm -f p{i,o}{1,2,3}
}
printf '%s\n' foo bar | f
Conclusion
If you want to interact with a command, use expect, or zsh's zpty, or named pipes.
If you want to do some fancy plumbing with pipes, use named pipes.
Co-processes can do some of the above, but be prepared to do some serious head scratching for anything non-trivial.
Co-processes were first introduced in a shell scripting language with the ksh88 shell (1988), and later in zsh at some point before 1993.
The syntax to launch a co-process under ksh is command |&. Starting from there, you can write to command's standard input with print -p and read its standard output with read -p.
More than a couple of decades later, bash, which had been lacking this feature, finally introduced it in its 4.0 release. Unfortunately, an incompatible and more complex syntax was selected.
Under bash 4.0 and newer, you can launch a co-process with the coproc command, e.g.:
$ coproc awk '{print $2;fflush();}'
You can then pass something to the command stdin that way:
$ echo one two three >&${COPROC[1]}
and read awk output with:
$ read -ru ${COPROC[0]} foo
$ echo $foo
two
Under ksh, that would have been:
$ awk '{print $2;fflush();}' |&
$ print -p "one two three"
$ read -p foo
$ echo $foo
two
Here is another good (and working) example: a simple server written in bash. Note that you need OpenBSD's netcat; the classic one won't work. Of course, you could use an inet socket instead of a unix one.
server.sh:
#!/usr/bin/env bash
SOCKET=server.sock
PIDFILE=server.pid

(
    # detach from the terminal
    exec </dev/null
    exec >/dev/null
    exec 2>/dev/null

    # listen (-l) on a unix socket (-U) and keep accepting connections (-k)
    coproc SERVER {
        exec nc -l -k -U $SOCKET
    }
    echo $SERVER_PID > $PIDFILE

    # answer each request line with "pong <line>"
    {
        while read; do
            echo "pong $REPLY"
        done
    } <&${SERVER[0]} >&${SERVER[1]}

    rm -f $PIDFILE
    rm -f $SOCKET
) &
disown $!
client.sh:
#!/usr/bin/env bash
SOCKET=server.sock

# connect to the server's unix socket
coproc CLIENT {
    exec nc -U $SOCKET
}

# send the arguments as one request line and read one reply line
{
    echo "$@"
    read
} <&${CLIENT[0]} >&${CLIENT[1]}

echo $REPLY
Usage:
$ ./server.sh
$ ./client.sh ping
pong ping
$ ./client.sh 12345
pong 12345
$ kill $(cat server.pid)
$