ruby chardet

rchardet gem support for ISO-8859-1 and Windows-1252


I would like to know whether rchardet supports detecting the ISO-8859-1 and Windows-1252 encodings. I have read the documentation but could not find clear information on this.


Solution

  • I didn't know the rchardet gem, but I found your answer after five minutes of reading the code at https://github.com/jmhodges/rchardet.

    • UTF-8 is supported in several places, including UniversalDetector and UTF8Prober.
    • ASCII is a subset of UTF-8, Latin-1, and others. It is detected directly by the UniversalDetector.
    • ISO-8859-1 (also known as Latin-1) is supported by the Latin1Prober
    • Windows-1252, which is very similar to Latin-1 (and, depending on the data, cannot always be distinguished from it), is also supported by the Latin1Prober.
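
To illustrate why the two encodings cannot always be told apart, here is a minimal sketch that uses only Ruby's built-in transcoding, not rchardet itself. The helper names `latin1_cp1252_differ?` and `decode_as` are made up for this example. The two encodings only disagree on bytes 0x80..0x9F, so data without such bytes decodes identically under both.

```ruby
# Hypothetical helper: true when the data contains a byte in 0x80..0x9F.
# Those bytes are C1 control codes in ISO-8859-1 but mostly printable
# characters (curly quotes, dashes, the euro sign) in Windows-1252.
def latin1_cp1252_differ?(data)
  data.each_byte.any? { |b| (0x80..0x9F).cover?(b) }
end

# Hypothetical helper: interpret raw bytes under the given source
# encoding and convert the result to UTF-8.
def decode_as(data, encoding)
  data.dup.force_encoding(encoding).encode("UTF-8")
end

plain  = "caf\xE9".b      # 0xE9 is "é" in both encodings
quoted = "\x93hi\x94".b   # 0x93/0x94 are curly quotes only in Windows-1252

puts latin1_cp1252_differ?(plain)
puts latin1_cp1252_differ?(quoted)
puts decode_as(plain, "ISO-8859-1") == decode_as(plain, "Windows-1252")
puts decode_as(quoted, "Windows-1252")
```

For detection with the gem itself, the project README documents `CharDet.detect(data)`, which returns a hash with `'encoding'` and `'confidence'` keys. Which name the Latin1Prober reports for a given input is best checked against the source.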

    As for your general question, you should get into the habit of reading the code of the gems you use instead of just the documentation. It helps you understand not only what the gem does, but also how it does it and what it does well or poorly. Reading code also makes you a better programmer.