Why does `ls -l` count more files than me?
The `12` you see is not the number of files, but the number of disk blocks consumed.
From `info coreutils 'ls invocation'`:
For each directory that is listed, preface the files with a line
`total BLOCKS', where BLOCKS is the total disk allocation for all
files in that directory. The block size currently defaults to 1024
bytes, but this can be overridden (*note Block size::). The
BLOCKS computed counts each hard link separately; this is arguably
a deficiency.
The total goes from 12 to 20 when you use `ls -la` instead of `ls -l` because you are counting two additional directories: `.` and `..`. You are using four disk blocks for each (empty) directory, so your total goes from 3 × 4 to 5 × 4. (In all likelihood, you are actually using one disk block of 4096 bytes for each directory; as the `info` page indicates, the utility does not check the disk format, but assumes a block size of 1024 bytes unless instructed otherwise.)
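You can check this by asking `ls` to report in 4096-byte units instead (a quick sketch, assuming GNU `ls` and the same directory as in the question, where each empty directory occupies one 4096-byte block):
$> ls -l --block-size=4096 | head -n 1
total 3
$> ls -la --block-size=4096 | head -n 1
total 5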
If you want to simply get the number of files, you might try something like
ls | wc -l
user4556274 has already answered the why. My answer serves only to provide additional information on how to count files properly.
In the Unix community the general consensus is that parsing the output of `ls` is a very bad idea, since filenames can contain control characters or hidden characters. For example, because of a newline character in a filename, `ls | wc -l` tells us there are 5 lines in the output of `ls` (which it does have), but in reality there are only 4 files in the directory.
$> touch FILE$'\n'NAME
$> ls
file1.txt file2.txt file3.txt FILE?NAME
$> ls | wc -l
5
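To see where the fifth line comes from, pipe `ls` into `cat` (assuming GNU `ls`, which only hides control characters when writing to a terminal); the embedded newline is then passed through literally:
$> ls | cat
file1.txt
file2.txt
file3.txt
FILE
NAME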
Method #1: find utility
The `find` command, which is typically used to work around the pitfalls of parsing filenames, can help us here by printing the inode number. Every file or directory has exactly one unique inode number. Thus, using `-printf "%i\n"` and excluding `.` via `-not -name "."`, we can get an accurate count of the files. (Note the use of `-maxdepth 1` to prevent recursive descent into subdirectories.)
$> find -maxdepth 1 -not -name "." -print
./file2.txt
./file1.txt
./FILE?NAME
./file3.txt
$> find -maxdepth 1 -not -name "." -printf "%i\n" | wc -l
4
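A variation on the same idea (my own sketch, assuming GNU `find` with support for `-printf`) prints a single character per entry and counts bytes instead of lines, which sidesteps line counting entirely:
$> find -maxdepth 1 -not -name "." -printf "x" | wc -c
4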
Method #2: shell globbing
Simple, quick, and mostly portable way:
$ set -- *
$ echo $#
228
The `set` command is used to set the positional parameters of the shell (the `$<INTEGER>` variables, as in `echo $1`). This is often used to work around the `/bin/sh` limitation of lacking arrays. A version that performs extra checks can be found in Gilles's answer over on Unix & Linux.
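As a rough sketch of the kind of extra check meant there (my own illustration, not a copy of that answer): when the glob matches nothing, `*` is left in place as a literal, so you can test whether `$1` really exists before trusting the count:
$> set -- * ; if [ -e "$1" ] || [ -L "$1" ]; then echo "$#" ; else echo 0 ; fi
4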
In shells that support arrays, such as `bash`, we can use
items=( dir/* )
echo ${#items[@]}
as proposed by steeldriver in the comments.
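One caveat worth noting (my own addition, assuming `bash`): in an empty directory the unmatched glob is kept literally and the array would contain one bogus element; enabling `nullglob` makes the count come out as 0 instead:
shopt -s nullglob
items=( dir/* )
echo ${#items[@]}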
A trick similar to the `find` method, which used `wc`, can combine globbing with `stat` to count one inode number per line:
$> LC_ALL=C stat ./* --printf "%i\n" | wc -l
4
An alternative approach is to use a wildcard in a `for` loop. (Note: this test uses a different directory, to check whether this approach descends into subdirectories, which it does not; 16 is the verified number of items in my `~/bin`.)
$> count=0; for item in ~/bin/* ; do count=$(($count+1)) ; echo $count ; done | tail -n 1
16
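A slightly tidier variant of the same loop (just a restructuring, not a different method) increments silently and prints the count once, after the loop has finished:
$> count=0; for item in ~/bin/* ; do count=$((count+1)) ; done ; echo $count
16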
Method #3: other languages/interpreters
Python can also deal with problematic filenames, by printing the length of the list returned by the `os.listdir()` function (which is non-recursive, and will only list items in the directory given as its argument).
$> python -c "import os ; print(os.listdir('.'))"
['file2.txt', 'file1.txt', 'FILE\nNAME', 'file3.txt']
$> python -c "import os ; print(len(os.listdir('.')))"
4
See also
- What's the most resource efficient way to count how many files are in a directory?