Why doesn't the command "ls | file" work?
The fundamental issue is that file expects file names as command-line arguments, not on stdin. When you write ls | file, the output of ls is being passed as input to file. Not as arguments, as input.
What's the difference?
Command-line arguments are what you write after a command: flags and file names, as in cmd arg1 arg2 arg3. In shell scripts these arguments are available as the variables $1, $2, $3, etc. In C you'd access them via the char **argv and int argc arguments to main().

Standard input, stdin, is a stream of data. Some programs, like cat or wc, read from stdin when they're not given any command-line arguments. In a shell script you can use read to get a single line of input. In C you can use scanf() or getchar(), among various options.
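To make the distinction concrete, here is a small sketch; the function names show_args and show_stdin are made up for illustration:

```shell
# show_args uses its command-line arguments, available as $1, $2, ...
show_args() {
    echo "first argument: $1"
    echo "second argument: $2"
}

# show_stdin ignores its arguments and reads one line from standard input.
show_stdin() {
    read -r line
    echo "read from stdin: $line"
}

show_args foo bar            # foo and bar are arguments
echo "hello" | show_stdin    # hello arrives on stdin, through the pipe
```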
file does not normally read from stdin. It expects at least one file name to be passed as an argument. That's why it prints its usage message when you write ls | file: you didn't pass any arguments.
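You can see the difference directly in a terminal; the exact usage text and the file-type description depend on your version of file, so treat the comments as approximate:

```shell
# With an argument, file inspects the named file:
file /etc/hosts        # something like "/etc/hosts: ASCII text"

# With no arguments, the piped-in names are ignored;
# file prints its usage message and exits with an error:
ls /etc | file
```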
You could use xargs to convert stdin into arguments, as in ls | xargs file. Still, as terdon mentions, parsing the output of ls is a bad idea. The most direct way to do this is simply:
file *
Because, as you say, the input of file has to be file names. The output of ls, however, is just text. The fact that it happens to be a list of file names doesn't change the fact that it is simply text, not the location of files on the hard drive.
When you see output printed on the screen, what you see is text. Whether that text is a poem or a list of filenames makes no difference to the computer. All it knows is that it is text. This is why you can pass the output of ls
to programs that take text as input (although you really, really shouldn't):
$ ls / | grep etc
etc
So, to use the output of a command that lists file names as text (such as ls or find) as input for a command that takes file names, you need to use some tricks. The typical tool for this is xargs:
$ ls
file1 file2
$ ls | xargs wc
9 9 38 file1
5 5 20 file2
14 14 58 total
As I said before, though, you really don't want to be parsing the output of ls. Something like find is better (the -print0 prints a \0 instead of a newline after each file name, and the -0 of xargs lets it deal with such input; this is a trick to make your commands work with file names containing newlines):
$ find . -type f -print0 | xargs -0 wc
9 9 38 ./file1
5 5 20 ./file2
14 14 58 total
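You can demonstrate the problem this guards against; the scratch directory and the file name containing a newline are invented for the demonstration:

```shell
# Create a file whose name contains an embedded newline:
dir=$(mktemp -d)
cd "$dir"
printf 'hello\n' > "$(printf 'bad\nname')"

# Line-based splitting sees two bogus names:
ls | xargs sh -c 'echo "$# arguments"' sh

# -print0 / -0 delivers the one real name intact:
find . -type f -print0 | xargs -0 sh -c 'echo "$# arguments"' sh
```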
find also has its own way of doing this, without needing xargs at all:
$ find . -type f -exec wc {} +
9 9 38 ./file1
5 5 20 ./file2
14 14 58 total
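The + at the end matters: it makes find append as many file names as possible to a single wc invocation, which is why you get a combined total. With \; instead, find runs wc once per file, so there is no total line. A quick comparison, assuming a directory containing file1 and file2 as above:

```shell
# One wc process per file; one output line each, no "total":
find . -type f -exec wc {} \;

# One wc process for all files; per-file lines plus a "total":
find . -type f -exec wc {} +
```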
Finally, you can also use a shell loop. However, note that in most cases, xargs
will be much faster and more efficient. For example:
$ for file in *; do wc "$file"; done
9 9 38 file1
5 5 20 file2
I learned that '|' (pipeline) is meant to redirect the output from a command to the input of another one.
A pipe doesn't "redirect" the output; it takes the output of one program and uses it as the input of another. But file doesn't take input: it takes file names as arguments, and those files are then tested. Neither redirection nor piping (the latter is what you are doing) passes file names as arguments.
What you can do is read the file names from a file with the --files-from option, if you have a file that lists all the files you want to test; otherwise, just pass the paths to your files as arguments.
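For example, assuming a version of file that supports -f / --files-from (GNU file does; the list file created with mktemp is just an illustration):

```shell
# Build a list of file names, one per line:
list=$(mktemp)
printf '/etc/hosts\n/etc/passwd\n' > "$list"

# file reads the names from the list and tests each one:
file --files-from "$list"
```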