Understanding piped commands in Unix/Linux
The only thing about your question that stands out as wrong is that you say "A would run first, then B gets the stdout of A".
In fact, both programs would be started at pretty much the same time. If there's no input for B when it tries to read, it will block until there is input to read. Likewise, if there's nobody reading the output from A, its writes will block until its output is read (some of it will be buffered by the pipe).
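One rough way to see that both sides start together is to time a pipeline whose two commands each take a fixed amount of time. This is only a sketch: it assumes a `date` that supports `+%s`, and the variable names are made up for the illustration.

```shell
# If A had to finish before B could start, this pipeline would need about
# 4 seconds. Because the shell starts both sides together, it needs about 2.
start=$(date +%s)
sleep 2 | sleep 2
elapsed=$(( $(date +%s) - start ))
echo "pipeline took about ${elapsed}s"
```

Neither `sleep` reads or writes anything, so nothing synchronises them and they simply run side by side.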
The only thing synchronising the processes that take part in a pipeline is the I/O, i.e. the reading and writing across the pipe. If no writing or reading happens, then the two processes will run totally independently of each other. If one ignores the reading or writing of the other, the ignored process will block and, when the other process terminates, either be killed by a SIGPIPE signal (if writing) or get an end-of-file condition on its standard input stream (if reading).
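Both endings described above can be observed from the shell. The following is a sketch rather than a definitive test: it assumes SIGPIPE has its default disposition (if the signal is ignored, `yes` fails with a write error instead), and the temporary file and variable names are only for illustration.

```shell
status_file=$(mktemp)

# Writing side: 'yes' prints "y" forever, 'head' reads one line and exits.
# Once the reader is gone, the next write by 'yes' fails and it is normally
# killed by SIGPIPE, which most shells report as exit status 128 + 13 = 141.
first_line=$(
  { yes; echo "$?" > "$status_file"; } | head -n 1
)
producer_status=$(cat "$status_file")
rm -f "$status_file"
echo "consumer read: $first_line"
echo "producer exit status: $producer_status"

# Reading side: 'wc' keeps reading until it sees end-of-file, which happens
# when 'printf' exits and the write end of the pipe is closed.
line_count=$(printf 'one\ntwo\n' | wc -l | tr -d ' ')
echo "lines seen before end-of-file: $line_count"
```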
The idiomatic way to describe A | B is that it's a pipeline containing two programs. The output produced on standard output from the first program is available to be read on the standard input by the second ("[the output of] A is piped into [the input of] B"). The shell does the required plumbing to allow this to happen.
If you want to use the words "consumer" and "producer", I suppose that's ok too.
The fact that these are programs written in C is not relevant. The fact that this is Linux, macOS, OpenBSD or AIX is not relevant.
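To get a feel for the plumbing the shell does for A | B, the same connection can be approximated by hand with a named pipe. This is an analogy only: the shell actually uses an anonymous pipe, not a FIFO in the file system, and the commands standing in for A and B here are arbitrary.

```shell
tmpdir=$(mktemp -d)
fifo="$tmpdir/pipe"
mkfifo "$fifo"

printf 'b\na\n' > "$fifo" &   # "A": started in the background, writes to the pipe
sorted=$(sort < "$fifo")      # "B": started right away, reads from the pipe
wait                          # collect the background writer

rm -r "$tmpdir"
echo "$sorted"
```

The result is the same as for `printf 'b\na\n' | sort`; the pipeline syntax just saves you from creating, connecting and cleaning up the pipe yourself.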
The term usually used in documentation is "pipeline", which consists of one or more commands (see the POSIX definition). So technically speaking, that's two commands you have there, two subprocesses for the shell (either fork()+exec()'ed external commands or subshells).
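The subprocess point can be checked with a small sketch: a variable changed inside a non-final pipeline element does not survive, because that element runs in its own subshell. The variable name x is just for illustration.

```shell
x=before
# The left-hand side runs in a subshell, so its assignment stays local to it.
{ x=changed; echo "inside the pipeline: x=$x"; } | cat
echo "after the pipeline: x=$x"
```

The first element is used here on purpose: some shells (ksh, zsh) run the last element of a pipeline in the current shell, so the same experiment on the right-hand side can behave differently from shell to shell.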
As for the producer-consumer part, the pipeline can be described by that pattern, since:
- The producer and consumer share a fixed-size buffer, and at least on Linux and macOS the pipe buffer has a fixed size.
- The producer and consumer are loosely coupled: commands in a pipeline don't know of each other's existence (unless they are actively checking the /proc/<pid>/fd directory).
- Producers write to stdout and consumers read stdin as if each were a single command being executed on its own, i.e. they can exist without each other.
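The bounded buffer from the first point can be made visible by letting the producer write more than the pipe can hold while the consumer is not reading yet. This sketch assumes the pipe capacity is well below 1 MiB (commonly 64 KiB on Linux); the file and variable names are hypothetical.

```shell
stamp=$(mktemp)
start=$(date +%s)
{
  dd if=/dev/zero bs=1024 count=1024 2>/dev/null   # producer: writes 1 MiB
  date +%s > "$stamp"                              # time at which its last write completed
} | {
  sleep 2                                          # consumer ignores its input at first
  wc -c > /dev/null                                # then drains the pipe
}
producer_done=$(cat "$stamp")
rm -f "$stamp"
waited=$(( producer_done - start ))
echo "producer finished after about ${waited}s"
```

The producer cannot finish until the consumer starts reading, so it is held up for roughly as long as the consumer sleeps.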
The difference I see here is that, unlike producer-consumer implementations in other languages, shell commands use buffering and write to stdout only once their buffer is filled. However, nothing says that a producer-consumer has to follow that rule; it only has to wait when the queue is full or discard data (the latter being something a pipeline doesn't do).