Decode XML/HTML entities from iso-8859-1 charset in NodeJS

I'm receiving Polish text from a SOAP action that has the Polish diacritics encoded as XML entities. As far as I can tell, they are encoded in ISO-8859-1 rather than UTF-8, and I'm struggling to decode them properly in NodeJS.

Example text: Borek Fa&#197;&#130;&#196;&#153;cki

Expected decoding result: Borek Fałęcki

Current result: Borek FaÅ‚Ä™cki

I achieved the correct result in PHP using the following code:

echo html_entity_decode('Borek Fa&#197;&#130;&#196;&#153;cki', ENT_QUOTES | ENT_SUBSTITUTE | ENT_XML1, 'ISO-8859-1');

I'm having no luck doing the same in NodeJS. There aren't many complete packages for decoding HTML/XML entities. I have tried both entities and html-entities, but they produce the same result, and neither of them seems to have any charset setting.

const { decode, encode } = require('html-entities');
const entities = require('entities');

const txt = 'Borek Fa&#197;&#130;&#196;&#153;cki';
console.log('html-entities decode', decode(txt));
console.log('utf8-encoding', encode('Borek Fałęcki', {
    mode: 'nonAsciiPrintable',
    numeric: 'decimal',
    level: 'xml',
}));
console.log('entities decode', entities.decodeXML(txt));

Output:

html-entities decode Borek FaÅ‚Ä™cki
utf8-encoding Borek Fa&#322;&#281;cki
entities decode Borek FaÅ‚Ä™cki

As we can see, when encoded with UTF-8 there are single entities for each character:

&#322; = ł
&#281; = ę

With ISO-8859-1, there are 2 entities per character. I have run out of ideas for achieving the same decoding result as in PHP. If there were no entities, I could just convert the encoding to UTF-8, but with entities I have no idea how to do it properly. I cannot get the other side to send me UTF-8, since this is a closed, old protocol that I have no control over.

Solution

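The correct XML encoding of Borek Fałęcki is Borek Fa&#322;&#281;cki. The SOAP action XML that you receive is wrongly encoded: each numeric entity carries one byte of the character's UTF-8 representation instead of one Unicode code point.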
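To see why the received entities come in pairs, it may help to compare the UTF-8 bytes of the two characters with their Unicode code points. This is only a small illustration; the variable names are made up for the example.

```javascript
// UTF-8 bytes of "łę": these match the entity values in the received text.
const utf8Bytes = [...Buffer.from('łę', 'utf8')];
console.log(utf8Bytes); // [ 197, 130, 196, 153 ]

// Unicode code points of "łę": these are what a correct XML encoder emits.
const codePoints = [...'łę'].map(ch => ch.codePointAt(0));
console.log(codePoints); // [ 322, 281 ]
```

So &#197;&#130; is the two-byte UTF-8 sequence for ł, and &#196;&#153; is the one for ę.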

However, the following code converts it as needed by treating every entity value as a raw byte and decoding the resulting buffer as UTF-8:

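const xml = "Borek Fa&#197;&#130;&#196;&#153;cki";
const decoded = Buffer.concat(
  xml
    // Split into plain-text runs and decimal numeric entities.
    .match(/[^&]+|&#\d+;/g)
    .map(c => c[0] === "&"
      // Treat each entity value as one raw byte.
      ? Buffer.of(Number(c.slice(2, -1)))
      // Plain text is turned into its UTF-8 bytes.
      : Buffer.from(c))
).toString(); // Reinterpret all bytes as UTF-8.
console.log(decoded); // Borek Fałęcki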
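If the input may also contain hexadecimal entities, named entities such as &amp;, or stray ampersands, the same idea can be wrapped in a function. The following is a minimal sketch, not part of any library: decodeByteEntities is a hypothetical name, and it assumes that every numeric entity with a value up to 255 is a UTF-8 byte, which holds for the sample above but may not hold for every message from the service.

```javascript
// Hypothetical helper: decodes numeric entities whose values are
// individual UTF-8 bytes rather than Unicode code points.
function decodeByteEntities(input) {
  const parts = [];
  // Tokens: decimal entity, hex entity, run of other text, or a lone "&".
  const tokenPattern = /&#(\d+);|&#[xX]([0-9a-fA-F]+);|[^&]+|&/g;
  for (const [token, dec, hex] of input.matchAll(tokenPattern)) {
    const value = dec !== undefined ? parseInt(dec, 10)
      : hex !== undefined ? parseInt(hex, 16)
      : null;
    if (value !== null && value <= 0xff) {
      // Assumption: values up to 255 are raw UTF-8 bytes.
      parts.push(Buffer.of(value));
    } else if (value !== null && value <= 0x10ffff) {
      // Larger values cannot be bytes, so treat them as real code points.
      parts.push(Buffer.from(String.fromCodePoint(value), 'utf8'));
    } else {
      // Plain text, named entities and invalid values are kept unchanged.
      parts.push(Buffer.from(token, 'utf8'));
    }
  }
  return Buffer.concat(parts).toString('utf8');
}

console.log(decodeByteEntities('Borek Fa&#197;&#130;&#196;&#153;cki')); // Borek Fałęcki
```

Named entities are left in place on purpose, so the result can still be passed to a regular entity decoder afterwards if needed.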