How to tell binary from text files in Linux
`file` is still the command you want. Any file that is text (according to its heuristics) will include the word "text" in the output of `file`; anything that is binary will not include the word "text".
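As a minimal sketch of that rule, the check can be wrapped in a small shell function. The name `is_text` is hypothetical (it is not part of `file`), and the function assumes `file` and `grep` are installed:

```shell
# is_text: hypothetical helper, not something file(1) provides.
# Exits 0 when file's description of the argument contains "text".
is_text() {
    # -b (brief) leaves the filename out of the output, so a path that
    # happens to contain "text" cannot cause a false match.
    # -- ends option parsing, so names starting with "-" are handled.
    file -b -- "$1" | grep -q 'text'
}
```

It could then be used as `is_text notes.txt && echo "looks like text"`, where `notes.txt` stands for whatever file you want to test.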
If you don't agree with the heuristics that `file` uses to determine text vs. non-text, then the question needs to be better specified, since text vs. non-text is an inherently vague distinction. For example, `file` does not identify a PGP public key block in ASCII as "text", but you might (since it is composed only of printable characters, even though it is not human-readable).
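If "composed only of printable characters" is the definition you actually want, one possible sketch is to delete every printable or whitespace byte and see whether anything is left. The name `only_printable` is hypothetical, and restricting the check to ASCII via `LC_ALL=C` is an assumption; UTF-8 text with non-ASCII characters would be rejected by this version.

```shell
# only_printable: hypothetical helper illustrating one stricter definition.
# Exits 0 when the file holds nothing but printable ASCII and whitespace.
only_printable() {
    # Refuse unreadable files instead of reporting them as printable.
    [ -r "$1" ] || return 2
    # In the C locale, [:print:] and [:space:] cover printable ASCII plus
    # tab, newline and similar. tr -d removes those bytes; any byte that
    # survives is non-printable, so a count of 0 means "all printable".
    [ "$(LC_ALL=C tr -d '[:print:][:space:]' < "$1" | wc -c)" -eq 0 ]
}
```

Under this definition an ASCII-armored PGP block passes, and so does an empty file.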
The `diff` manual specifies that:

> diff determines whether a file is text or binary by checking the first few bytes in the file; the exact number of bytes is system dependent, but it is typically several thousand. If every byte in that part of the file is non-null, diff considers the file to be text; otherwise it considers the file to be binary.
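The heuristic quoted above can be approximated in shell. This is only a sketch of the described behavior, not how `diff` is implemented: the function name `looks_like_text_to_diff` is hypothetical, and the 4096-byte prefix is an assumed stand-in for the "system dependent" size.

```shell
# looks_like_text_to_diff: hypothetical helper that mimics the quoted
# description, not diff's actual code.
# Exits 0 when the first 4096 bytes (an assumed size) contain no NUL byte.
looks_like_text_to_diff() {
    # Refuse unreadable files instead of reporting them as text.
    [ -r "$1" ] || return 2
    # head -c limits the check to the prefix; tr -cd '\000' deletes every
    # byte except NUL, so wc -c counts the NUL bytes in that prefix.
    [ "$(head -c 4096 -- "$1" | LC_ALL=C tr -cd '\000' | wc -c)" -eq 0 ]
}
```

Because only the prefix is inspected, a file whose first NUL byte appears after the prefix is still reported as text by this sketch.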