What is the most reliable command to find the actual size of a file in Linux?

There is no inaccuracy or reliability issue here; you're just comparing two different numbers: logical size vs. physical size.

Here's Wikipedia's illustration for sparse files:

[Image: sparse file illustration from the Wikipedia article on sparse files]

ls shows the gray+green areas, the logical length of the file. du (without --apparent-size) shows only the green areas, since those are the ones that take up space.

You can create a sparse file with dd count=0 bs=1M seek=100 of=myfile.

ls shows 100MiB because that's how long the file is:

$ ls -lh myfile
-rw-r----- 1 me me 100M Jul 15 10:57 myfile

du shows 0, because that's how much space has actually been allocated for it:

$ du myfile
0 myfile
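
To see both numbers from the same tool, du can report either one. This is only a rough sketch, assuming GNU coreutils and a filesystem that supports sparse files; the scratch file created by mktemp is just for illustration and is not part of the original example:

```shell
# Hypothetical scratch file, so nothing in the current directory is touched.
f=$(mktemp)

# Same trick as above: set the length to 100 MiB without writing any data.
dd count=0 bs=1M seek=100 of="$f" 2>/dev/null

# Logical length in bytes (what ls -l reports).
logical=$(du --apparent-size --block-size=1 "$f" | cut -f1)

# Bytes actually allocated on disk (what plain du reports).
physical=$(du --block-size=1 "$f" | cut -f1)

echo "logical:  $logical bytes"
echo "physical: $physical bytes"

rm -f "$f"
```

The logical figure should be 104857600 (100 MiB). The physical figure depends on the filesystem, but on one that supports sparse files it should be 0 or very close to it.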

ls -l --block-size=M

will give you a long format listing (needed to actually see the file size) and round file sizes up to the nearest MiB.

If you want MB (10^6 bytes) rather than MiB (2^20 bytes) units, use --block-size=MB instead.

If you don't want the M suffix attached to the file size, you can use something like --block-size=1M. Thanks Stéphane Chazelas for suggesting this.

This is described in the man page for ls; run man ls and search for SIZE. It allows for units other than MB/MiB and, from the looks of it (I didn't try that), arbitrary block sizes too (so you could see the file size as a number of 412-byte blocks, if you want to).

Note that the --block-size parameter is a GNU extension on top of the Open Group's ls, so this may not work if you don't have a GNU userland (most Linux installations do have one). The ls from GNU coreutils 8.5 does support --block-size as described above.
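
A quick way to see the suffix and rounding behaviour for yourself is to try the variants on a tiny file. This is a sketch that assumes GNU ls; the scratch file and the size_mib variable are made up for the demonstration:

```shell
f=$(mktemp)                      # hypothetical scratch file
printf 'x' > "$f"                # exactly 1 byte long

ls -l --block-size=M  "$f"       # size column in MiB, with an M suffix
ls -l --block-size=1M "$f"       # same unit, no suffix
ls -l --block-size=MB "$f"       # units of 10^6 bytes instead of 2^20

# Pull out just the size column (field 5 of the long listing).
size_mib=$(ls -l --block-size=1M "$f" | awk '{print $5}')
echo "size in MiB, rounded up: $size_mib"

rm -f "$f"
```

Because sizes are rounded up, even this 1-byte file is listed as 1 in the size column, so --block-size is not a good fit when you need exact byte counts.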


There are several notions of file size, as explained in the other answer and in the Wikipedia figure on sparse files.

However, you might want to use both the ls(1) and stat(1) commands.
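
For instance, stat can print the logical size and the allocation in one go. A minimal sketch, assuming GNU stat (the -c format option is not available in every stat implementation); the scratch file and variable names are only for illustration:

```shell
f=$(mktemp)                                   # hypothetical scratch file
dd count=0 bs=1M seek=100 of="$f" 2>/dev/null # make it a 100 MiB sparse file

# %s = logical size in bytes
# %b = number of blocks allocated
# %B = size in bytes of each block counted by %b
stat -c 'size=%s blocks=%b blocksize=%B' "$f"

size=$(stat -c %s "$f")
alloc=$(( $(stat -c %b "$f") * $(stat -c %B "$f") ))
echo "logical: $size bytes, allocated: $alloc bytes"

rm -f "$f"
```

Multiplying %b by %B gives the allocated bytes, which is the same quantity du reports, while %s matches what ls -l shows.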

If coding in C, consider using stat(2) & lseek(2) syscalls.

See also the references in this answer.
