I am trying to decode this HTML page using Node.js with the Request module: http://www.receita.fazenda.gov.br/PessoaJuridica/CNPJ/cnpjreva/Cnpjreva_Erro.asp
The browser's JavaScript console reports the charset as windows-1252:
document.characterSet = "windows-1252";
I tried all the available encodings in iconv-lite, but every one of them returns the wrong text.
var body = iconv.decode(new Buffer(body), "windows1252");
Does anyone have any idea how to decode this page?
Sample code:
request('http://www.receita.fazenda.gov.br/PessoaJuridica/CNPJ/cnpjreva/Cnpjreva_Erro.asp', function (err, res, body) {
    var body = iconv.decode(new Buffer(body), "windows1252");
    console.log(body);
});
Returns:
...
<td valign="middle" align="left"><b><font face="Arial" size="2">
Acesso n�o permitido.
</td>
...
Decoded string should be:
...
<td valign="middle" align="left"><b><font face="Arial" size="2">
Acesso não permitido.
</td>
...
Thanks.
The encoding that the page reports through document.characterSet is wrong; the correct encoding is ISO-8859-1:
body = iconv.decode(body, "ISO-8859-1");
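Note that this only works if body still holds the raw bytes. By default, request decodes the response as UTF-8 and passes body as a string, which is probably why every encoding tried in the question gave the same wrong text. Here is a minimal sketch of the effect using only Node's built-in Buffer. The three sample bytes are my own assumption (the word "não" as an ISO-8859-1 server would send it), and I use the built-in 'latin1' decoder in place of iconv-lite so the snippet runs without any dependency:

```javascript
// Assumed sample: the bytes for "não" in ISO-8859-1 (ã is the single byte 0xE3).
const raw = Buffer.from([0x6e, 0xe3, 0x6f]);

// What happens when the bytes are decoded as UTF-8 first:
// 0xE3 is not a valid UTF-8 sequence here, so it becomes U+FFFD (the "�" character).
const asUtf8 = raw.toString('utf8');

// Wrapping that string in a Buffer again does not bring the original byte back:
// U+FFFD is re-encoded as EF BF BD, so no later decode can recover the ã.
const rewrapped = Buffer.from(asUtf8, 'utf8');

// Decoding the untouched bytes as ISO-8859-1 (Node calls it 'latin1') gives the right text.
const decoded = raw.toString('latin1');

console.log(JSON.stringify(asUtf8));    // the ã has been replaced
console.log(rewrapped.toString('hex')); // original 0xE3 byte is gone
console.log(decoded);                   // correctly decoded text
```

With request, passing encoding: null in the options object keeps body as a Buffer, so iconv.decode(body, "ISO-8859-1") then receives the original bytes instead of an already-decoded string.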