Does tee slow down pipelines?
Yes, it slows things down. And it effectively does have a queue of unwritten data, though that queue is maintained by the kernel; all programs have one, unless they explicitly request otherwise.
For example, here is a trivial pipe using pv, which is nice because it displays the transfer rate:
$ pv -s 50g -S -pteba /dev/zero | cat > /dev/null
50GiB 0:00:09 [ 5.4GiB/s] [===============================================>] 100%
Now, let's add a tee in there, not even writing an extra copy, just forwarding the data along:
$ pv -s 50g -S -pteba /dev/zero | tee | cat > /dev/null
50GiB 0:00:20 [2.44GiB/s] [===============================================>] 100%
So, that's quite a bit slower, and it wasn't even doing anything! That's the overhead of tee internally copying STDIN to STDOUT. (Interestingly, adding a second pv in there stays at 5.19GiB/s, so pv is substantially faster than tee. pv uses splice(2); tee likely does not.)
Anyway, let's see what happens if I tell tee to write to a file on disk. It starts out fairly fast (~800MiB/s) but as it goes on, it keeps slowing down, ultimately to ~100MiB/s, which is basically 100% of the disk write bandwidth. (The fast start is due to the kernel caching the disk write, and the slowdown to disk write speed is the kernel refusing to let the cache grow infinitely.)
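The fast-start-then-slowdown pattern can be observed without tee at all. The following is only a rough sketch: the function name write_rates, its parameters, and the window size are made up for illustration. It writes zeros to a file and reports the rate seen in each window of chunks. Whether and when the rate drops depends on how much dirty page cache your kernel allows, so the total size needed to see the effect is machine-dependent.

```python
import os
import time


def write_rates(path, total_bytes, chunk_size=1 << 20, window=8):
    """Write zeros to `path`; return the MiB/s observed for each window of chunks.

    Hypothetical helper for illustration only. No fsync is issued, so early
    windows mostly measure writes landing in the kernel's page cache.
    """
    chunk = bytes(chunk_size)
    rates = []
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while written < total_bytes:
            start = time.perf_counter()
            in_window = 0
            for _ in range(window):
                if written >= total_bytes:
                    break
                # Never write past total_bytes; os.write may also write less
                # than requested, so count what it actually reports.
                n = os.write(fd, chunk[: total_bytes - written])
                written += n
                in_window += n
            # Guard against a zero-length interval on very fast writes.
            elapsed = max(time.perf_counter() - start, 1e-9)
            rates.append(in_window / elapsed / (1 << 20))
    finally:
        os.close(fd)
    return rates
```

Calling it with a total well beyond what the cache will absorb (the size is a guess and depends on your RAM and kernel settings) and printing the returned list should show the early windows running faster than the later ones on a setup like the one described.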
Does it matter?
The above is a worst case: it uses a pipe to spew data as fast as possible. The only real-world use I can think of like this is piping raw YUV data to/from ffmpeg.
When you're sending data at slower rates (because you're processing it, etc.), the effect is going to be much less significant.
Nothing surprising here. After all, POSIX says:
DESCRIPTION
The tee utility shall copy standard input to standard output, making a copy in zero or more files. The tee utility shall not buffer output.
And also that
RATIONALE
The buffering requirement means that tee is not allowed to use ISO C standard fully buffered or line-buffered writes. It does not mean that tee has to do 1-byte reads followed by 1-byte writes.
So, without going into the rationale any further, tee will probably read and write at most as many bytes as fit into your pipe buffer at a time, flushing the output on every single write.
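To make the "no buffering" behaviour concrete, here is a minimal sketch of such a copy loop. It is not the coreutils implementation; the name tee_copy and the 64 KiB default (chosen to match the usual Linux pipe capacity) are assumptions for illustration. Each chunk that read returns is written to every output immediately, with nothing held back in user space.

```python
import os


def tee_copy(in_fd, out_fds, bufsize=65536):
    """Copy in_fd to every descriptor in out_fds; return the bytes read.

    Illustrative sketch only. bufsize is an assumed value, not what any
    particular tee implementation uses.
    """
    total = 0
    while True:
        # os.read returns as soon as some data is available, up to bufsize.
        data = os.read(in_fd, bufsize)
        if not data:  # empty result means end of input
            break
        for fd in out_fds:
            view = memoryview(data)
            # os.write may accept fewer bytes than offered, so loop until
            # the whole chunk has been handed to the kernel.
            while len(view) > 0:
                n = os.write(fd, view)
                view = view[n:]
        total += len(data)
    return total
```

Running tee_copy(0, [1]) would forward standard input to standard output the way a bare tee does. Every chunk costs one read plus one write per output, each copying the data between kernel and user space, which is the overhead that a splice(2)-based tool avoids.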
And yes, depending on the application, this can be rather inefficient, so feel free to simply remove or comment out either of these:
https://github.com/coreutils/coreutils/blob/master/src/tee.c#L208
https://github.com/coreutils/coreutils/blob/master/src/tee.c#L224