What could cause the file command in Linux to report a text file as binary data?

Vim tries very hard to make sense of whatever you throw at it without complaining, which makes it a relatively poor tool for diagnosing why file reports a file as binary.

Vim's "[converted]" notice indicates there was something in the file that vim wouldn't expect to see in the text encoding suggested by your locale settings (LANG etc).

Others have already suggested

  • cat -v
  • xxd
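
For example, either of these makes non-printing bytes visible (head just keeps the output short):

cat -v filename | head    # control characters show up as ^X, high bytes as M-x
xxd filename | head       # hex dump with an ASCII column alongside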

You could try grepping for non-ASCII characters.

  • grep -P '[\x7f-\xff]' filename
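
A small refinement of that command (GNU grep is assumed for -P) prints line numbers and also catches stray control characters rather than just high bytes:

grep -nP '[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\xff]' filename   # -n prefixes each match with its line number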

The other possibility is non-standard line endings for the platform (e.g. CRLF or CR), but I'd expect file to cope with that and report "DOS text file" or similar.
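
If you want to rule line endings out anyway, a quick check and fix along these lines should work (the $'\r' quoting assumes a bash-like shell):

grep -c $'\r' filename               # number of lines containing a carriage return
tr -d '\r' < filename > unix.txt     # strip the CRs (CRLF -> LF)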


If you run file -D filename, file displays debugging information, including the tests it performs. Near the end, it will show what test was successful in determining the file type.

For a regular text file, it looks like this:

[31> 0 regex,=^package[ \t]+[0-9A-Za-z_:]+ *;,""]
1 == 0 = 0
ascmagic 1
filename.txt: ISO-8859 text, with CRLF line terminators

This tells you what file found in order to decide on that MIME type.
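
Another quick check, assuming a reasonably recent version of file, is to ask for the encoding alone; clean text typically comes back as us-ascii, utf-8 or iso-8859-1, while stray control bytes push it to binary (the output in the comment is illustrative):

file --mime-encoding filename.txt    # e.g. "filename.txt: iso-8859-1" or "filename.txt: binary"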


I found the issue by using a binary search to locate the problematic line:

head -n {1/2 line count} file.cpp > a.txt
tail -n {1/2 line count} file.cpp > b.txt
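
As a concrete illustration of the halving step (the shell arithmetic assumes bash or a similar shell):

LINES=$(wc -l < file.cpp)                            # total number of lines
head -n $(( LINES / 2 )) file.cpp > a.txt            # first half
tail -n $(( LINES - LINES / 2 )) file.cpp > b.txt    # second half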

Running file against each half, and repeating the process, helped me locate the offending line. I found a Control+P (^P) character embedded in it, and removing it solved the problem. In the future I'll write myself a Perl script to search for these (and other extended) characters, along the lines of the sketch below.
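
A minimal version of such a script, assuming the goal is to flag control characters (other than tab, LF and CR) and bytes above 0x7F, could be a one-liner (^P is 0x10, so it falls inside the first range):

perl -ne 'print "$.: $_" if /[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\xff]/' file.cpp   # prints "line-number: line" for each suspect line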

A big thanks to everyone who provided an answer for all the tips!