How to find encoding of a file via script on Linux?

Sounds like you're looking for enca. It can guess and even convert between encodings. Just look at the man page.

Or, failing that, use file -i (Linux) or file -I (OS X). That prints MIME-type information for the file, including the character-set encoding. I found a man page for it, too :)

file -bi <file name>
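Since the question asks for something usable from a script, here is a minimal sketch that pulls just the charset out of that output. It assumes the output looks like "text/plain; charset=utf-8"; the function names charset_from_mime and file_charset are made up for illustration and are not part of file itself.

```shell
#!/bin/sh
# Extract the charset field from a MIME string such as
# "text/plain; charset=utf-8". Prints nothing if no charset is present.
charset_from_mime() {
    printf '%s\n' "$1" | sed -n 's/.*charset=\([^;[:space:]]*\).*/\1/p'
}

# Print the guessed encoding of one file (hypothetical helper name).
# Relies on file(1) being installed; use -I instead of -i on OS X.
file_charset() {
    charset_from_mime "$(file -bi -- "$1")"
}
```

Calling file_charset notes.txt should then print only the charset name, which is easier to compare in an if or case statement than the full MIME line.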

If you'd like to do this for a bunch of files:

find . -type f | grep -Ev Eliminate | while IFS= read -r f; do echo "$f -- $(file -bi "$f")"; done

uchardet - An encoding detector library ported from Mozilla.

Usage:

~> uchardet file.java 
UTF-8

Various Linux distributions (Debian/Ubuntu, OpenSuse-packman, ...) provide binaries.
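If the script has to run on machines where uchardet may or may not be installed, one option is to fall back to file. This is only a sketch: guess_encoding is a made-up name, and it assumes your version of file supports the --mime-encoding option (current GNU file does).

```shell
#!/bin/sh
# guess_encoding FILE: print a best-guess charset name for FILE.
# Prefers uchardet when it is available, otherwise falls back to file(1).
guess_encoding() {
    if command -v uchardet >/dev/null 2>&1; then
        uchardet "$1"
    elif command -v file >/dev/null 2>&1; then
        # Assumption: this file(1) build understands --mime-encoding.
        file -b --mime-encoding "$1"
    else
        echo "no encoding detector found" >&2
        return 1
    fi
}
```

Note that the two tools do not spell names the same way (uchardet prints UTF-8, file prints utf-8), so normalize the case before comparing results.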
