Tags: text, unicode, encoding, character-encoding

How to determine the encoding of a text file


I have .txt and .java files and I don't know how to determine their character encoding (Unicode, UTF-8, ISO-8859, …). Is there a program that can detect or display a file's encoding?


Solution

  • If you're on Linux, try file -i filename.txt, which prints the file's MIME type and character set.

    $ file -i vol34.tex 
    vol34.tex: text/x-tex; charset=us-ascii
    

    For reference, here is my environment:

    $ which file
    /usr/bin/file
    $ file --version
    file-5.09
    magic file from /etc/magic:/usr/share/misc/magic
    

    Some file versions (e.g. file-5.04 on OS X/macOS) use slightly different command-line switches, namely -I (capital i) or --mime:

    $ file -I vol34.tex 
    vol34.tex: text/x-tex; charset=us-ascii
    $ file --mime vol34.tex
    vol34.tex: text/x-tex; charset=us-ascii
    

    Also, have a look here.
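
If file is not available, a rough check can be scripted. The following is a minimal Python sketch, not a reimplementation of file: it looks for a byte-order mark, then tests whether the bytes decode as ASCII or UTF-8. The names guess_encoding and guess_file_encoding are hypothetical helpers introduced for illustration, and the "unknown-8bit" label is simply the fallback chosen here. It cannot tell legacy 8-bit encodings (such as the ISO-8859 family) apart, because almost any byte sequence is valid in them.

```python
import codecs

# Byte-order marks, longest first so that UTF-32 LE (FF FE 00 00)
# is not mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_encoding(data: bytes) -> str:
    """Return a best-guess encoding label for raw bytes (hypothetical helper)."""
    # 1. A byte-order mark, if present, is the strongest hint.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. Pure 7-bit content is reported as ASCII.
    try:
        data.decode("ascii")
        return "us-ascii"
    except UnicodeDecodeError:
        pass
    # 3. Valid UTF-8 is unlikely to occur by accident in other encodings.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: anything else is reported as an unidentified 8-bit encoding.
        return "unknown-8bit"


def guess_file_encoding(path: str) -> str:
    """Read a file in binary mode and guess its encoding."""
    with open(path, "rb") as f:
        return guess_encoding(f.read())
```

Calling guess_file_encoding("filename.txt") returns one of the labels above. The result is a guess based on the bytes alone, so a file reported as us-ascii or utf-8 may still have been intended as something else.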