When should I use input redirection?

From the grep man page (on Debian):

DESCRIPTION

   grep  searches the named input FILEs (or standard input if no files are
   named, or if a single hyphen-minus (-) is given as file name) for lines
   containing  a  match to the given PATTERN.  By default, grep prints the
   matching lines.

In the first case (grep line file), grep opens the file itself; in the second (grep line < file), the shell opens the file and makes it the standard input of grep, and grep, not having been passed any file name argument, assumes it needs to search its standard input.
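To make the two cases concrete, here is a minimal sketch, assuming they are grep line file and grep line < file. The sample file and its contents are made up for the illustration.

```shell
# Hypothetical sample file, created only for this demonstration.
tmp=$(mktemp -d)
printf 'first line\nsomething else\nlast line\n' > "$tmp/file"

# Case 1: grep is given the file name and opens the file itself.
by_name=$(grep line "$tmp/file")

# Case 2: the shell opens the file on fd 0; grep gets no file name argument
# and so reads its standard input.
by_stdin=$(grep line < "$tmp/file")

printf '%s\n' "$by_name"
printf '%s\n' "$by_stdin"

rm -rf "$tmp"
```

Both commands select the same lines; what differs is who opens the file and what grep knows about it.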

Pros of 1:

grep can grep more than one file¹.
grep can display the name of the file in which each matching line is found.
grep could² (but I don't know of any implementation that does) do a posix_fadvise(POSIX_FADV_SEQUENTIAL) on the file descriptors it opens.
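As a small illustration of the first two points, the sketch below passes two made-up files (file1 and file2 are hypothetical names) to a single grep, then feeds the same data through standard input for comparison.

```shell
# Two hypothetical files, created only for this demonstration.
tmp=$(mktemp -d)
printf 'a line\n' > "$tmp/file1"
printf 'nothing\nanother line\n' > "$tmp/file2"

# Given several file names, grep prefixes each match with the name of its file.
named=$(cd "$tmp" && grep line file1 file2)
printf '%s\n' "$named"
# file1:a line
# file2:another line

# Through standard input, grep has no names to report.
anonymous=$(cat "$tmp/file1" "$tmp/file2" | grep line)
printf '%s\n' "$anonymous"
# a line
# another line

rm -rf "$tmp"
```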

Pros of 2:

If the file can't be opened, the shell returns an error which will include more relevant information (like line number in the script) and in a more consistent way (if you let the shell open files for other commands as well) than when grep opens it. And if the file can't be opened, grep is not even called (which for some commands -- maybe not grep -- can make a big difference).
In grep line < in > out, if in can't be opened, out won't be created or truncated.
There's no problem with some files with unusual names (like - or file names starting with -)³.
Cosmetic: you can put <file anywhere on the command line to show the command flow more naturally, like <in grep line >out if you prefer.
Cosmetic: with GNU grep, you can choose what label to use in front of the matching line instead of just the file name, as in:
```
 <file grep --label='Found in file at line' -Hn line
```
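The point about out not being truncated can be checked with a sketch like the following. The file names (missing, out1, out2) are hypothetical; the behaviour relies on the shell processing redirections from left to right and giving up on the command at the first one that fails.

```shell
# Hypothetical output files holding data we would not want to lose.
tmp=$(mktemp -d)
printf 'precious\n' > "$tmp/out1"
printf 'precious\n' > "$tmp/out2"

# grep opens the (missing) input itself: the shell has already truncated
# out1 by the time grep runs and fails.
grep line "$tmp/missing" > "$tmp/out1" 2> /dev/null ||
  echo "grep failed with status $?"

# The shell opens the input: the first redirection fails, so "> out2" is
# never performed and grep is not run at all.
{ grep line < "$tmp/missing" > "$tmp/out2"; } 2> /dev/null ||
  echo "the shell could not open the input; grep was not run"

after1=$(cat "$tmp/out1")   # empty: out1 was truncated
after2=$(cat "$tmp/out2")   # still "precious"
printf 'out1: [%s]\nout2: [%s]\n' "$after1" "$after2"

rm -rf "$tmp"
```

Note that the order matters: with grep line > out < in, the shell would truncate out before discovering that in cannot be opened.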

In terms of performance, if the file can't be opened, you save the execution of grep when using redirection, but otherwise for grep I don't expect much difference.

With redirection, you avoid passing an extra argument to grep, which makes grep's argument parsing slightly easier. On the other hand, the shell will need (at least) one extra system call to dup2() the file descriptor onto file descriptor 0.

In { grep -m1 line; next command; } < file, grep (here GNU grep) will want to lseek() back to just after the matching line so the next command sees the rest of the file (it will also need to determine whether the file is seekable or not). In other words, the position within stdin is another one of grep's outputs. With grep -m1 line file, it can optimise that out, which is one fewer thing for grep to care about.
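A minimal sketch of that behaviour, assuming GNU grep and a regular (seekable) file, with cat standing in for the next command. The sample contents are made up, and other grep implementations may leave the file offset elsewhere.

```shell
# Hypothetical sample file, created only for this demonstration.
tmp=$(mktemp -d)
printf 'header\nline 1\nrest A\nrest B\n' > "$tmp/file"

# grep and cat share the same open file description on fd 0.
# GNU grep stops after the first match and repositions stdin to just after
# it, so cat picks up from there.
shared=$({ grep -m1 line; cat; } < "$tmp/file")
printf '%s\n' "$shared"
# With GNU grep, this prints:
# line 1
# rest A
# rest B

rm -rf "$tmp"
```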

Notes

¹ With zsh, you can do:

grep line < file1 < file2

but that's doing the equivalent of cat file1 file2 | grep line (without invoking the cat utility) and so is less efficient, can cause confusion if the first file doesn't end in a newline character and won't let you know in which file the pattern is found.

² That is to tell the system that grep is going to read the file sequentially, so that the I/O scheduler can make more educated decisions, for instance about how to read the data. grep can do that on its own fd, but it would be wrong to do it on the fd 0 that it borrows from its caller, as that fd (or rather the open file description it references) could be used later, or even at the same time, for non-sequential reads.

³ In the case of ksh93 and bash though, there are files like /dev/tcp/host/port (and /dev/fd/x on some systems in bash) which, when used as the target of a redirection, the shell intercepts for special purposes instead of really opening the file on the file system (though generally, those files don't exist on the file system). /dev/stdin serves the same purpose as the - recognised by grep, but at least here it's more properly namespaced (anybody can create a file called - in any directory, while only administrators can create a file called /dev/tcp/host/port, and administrators should know better).

The answer by StephaneChazelas covers grep(1), and most commands of Unix lineage work that way, but not all. It is standard to read either from standard input (from the keyboard, from a file redirected via < file, or from the output piped by another command; contrived example: ls * | grep '^ab*c$'), or from the file(s) given as arguments, like grep comment file1 file2 file3. Some commands use the convention that the file named - is standard input, so you can say make-middle | cat head - tail to get a stream with head, whatever make-middle generates, followed by tail. This is by design, to give flexibility in the use of the commands.
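The - convention can be tried with a sketch like this one. Here make_middle is a hypothetical shell function standing in for the make-middle command, and the head and tail files are created only for the demonstration.

```shell
# Hypothetical head and tail files, created only for this demonstration.
tmp=$(mktemp -d)
printf 'HEAD\n' > "$tmp/head"
printf 'TAIL\n' > "$tmp/tail"

# Hypothetical stand-in for the make-middle command.
make_middle() { printf 'middle 1\nmiddle 2\n'; }

# cat reads the head file, then its standard input (the -), then the tail file.
combined=$(make_middle | cat "$tmp/head" - "$tmp/tail")
printf '%s\n' "$combined"
# HEAD
# middle 1
# middle 2
# TAIL

rm -rf "$tmp"
```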

Which is better? As long as it works, cmd file is shorter than cmd < file; there could be a tiny difference in time between the shell doing the file frobbing (<) and the command doing it by itself, but probably unnoticeable unless you do nothing else all day long. It will depend on considerations like the pros mentioned in Stephane's answer.
