How can I decode ASCII characters from strings?

I am parsing a binary file to extract the text content. This is for a library that can be run in either a Node environment or a web browser. I need to convert every character to its human-readable form, whatever encoding it arrived in. For example, I receive a string like

'SeÃ±or and salvaciÃ³n and Number%3A 1234%3B %06%88'

and I expect the output to be

'Señor and salvación and Number: 1234; ♠'

Currently I am using a mixture of decoding and escaping strings using a function I found on another SO question. I am absolutely OK with throwing it away in favor of something else that works better. I know what I am doing is not ideal at all, but I am unsure of what I need to do to make this work correctly. The example below shows that function and the steps that lead to the final output, which is close but not perfect.

The other problem is that decodeURIComponent will sometimes throw a "URIError: URI malformed" error, depending on what kind of input I give it.

function escapeString(str) {
  //A replacement for the deprecated escape method
  //https://stackoverflow.com/a/37303214/79677
  const allowed = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789@*_+-./,';
  str = str.toString();
  const len = str.length;
  let R = '';
  let k = 0;
  let S = '';
  let chr = '';
  let ord = 0;
  while (k < len) {
    chr = str[k];
    if (allowed.indexOf(chr) !== -1) {
      S = chr;
    } else {
      ord = str.charCodeAt(k);
      if (ord < 256) {
        S = '%' + ('00' + ord.toString(16)).toUpperCase().slice(-2);
      } else {
        S = '%u' + ('0000' + ord.toString(16)).toUpperCase().slice(-4);
      }
    }
    R += S;
    k++;
  }
  return R;
}

const str = 'SeÃ±or and salvaciÃ³n and Number%3A 1234%3B %06%88';
//Expecting: 'Señor and salvación and Number: 1234; ♠'

console.log(1, str);
console.log(2, escapeString(str));
console.log(3, decodeURIComponent(escapeString(str)));
console.log(4, unescape(decodeURIComponent(escapeString(str))));

How can I properly, correctly, and consistently decode/convert my strings to the human-readable versions?

Solution

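You are facing a case of mojibake mixed with percent-encoding: the text was originally UTF-8, but its bytes were decoded as Windows-1252 (cp1252), and some characters are additionally percent-encoded. An example in Python, chosen because it is easy to read: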

from urllib.parse import unquote
string = 'SeÃ±or and salvaciÃ³n and Number%3A 1234%3B'
unquote(string).encode('cp1252').decode('utf-8')

'Señor and salvación and Number: 1234;'

The same idea rewritten in JavaScript (sorry for the clumsy code):

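function byteToUint8Array(byteArray) {
    // https://stackoverflow.com/a/34821126/3439404
    // Copy a plain array of byte values into a typed array.
    var uint8Array = new Uint8Array(byteArray.length);
    for (var i = 0; i < uint8Array.length; i++) {
        uint8Array[i] = byteArray[i];
    }

    return uint8Array;
}
function getBytes(txt) {
    // Treat every UTF-16 code unit as one byte. This is only correct for
    // code units below 0x100; larger values wrap modulo 256 in Uint8Array.
    var bytes = [];
    for (var i = 0; i < txt.length; ++i) {
        bytes.push(txt.charCodeAt(i));
    }
    return byteToUint8Array(bytes);
}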

var decoder = new TextDecoder("utf-8");
const str = 'SeÃ±or and salvaciÃ³n and Number%3A 1234%3B' // %06%88';
//Expecting: 'Señor and salvación and Number: 1234;' // ♠'

console.log(1, str);
console.log(5, decoder.decode( getBytes( decodeURIComponent(str))));
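
To see why the byte-by-byte conversion repairs the text, it may help to reproduce the damage in the other direction. The following is only a sketch: `garble` and `repair` are hypothetical helper names, and it assumes every garbled character has a code below 0x100 (true for the `Ã±` and `Ã³` in the sample string, not for every cp1252 character).

```javascript
// Produce mojibake: encode the text as UTF-8, then misread each byte
// as if it were a whole character.
function garble(text) {
  const utf8Bytes = new TextEncoder().encode(text);
  return String.fromCharCode(...utf8Bytes);
}

// Undo it: turn each character back into one byte, then decode as UTF-8.
// Assumes every character code is below 0x100.
function repair(text) {
  const bytes = Uint8Array.from(text, (ch) => ch.charCodeAt(0));
  return new TextDecoder('utf-8').decode(bytes);
}

console.log(garble('Señor'));         // 'SeÃ±or'
console.log(repair(garble('Señor'))); // 'Señor'
```

`TextEncoder` and `TextDecoder` are globals in current browsers and in recent Node versions, so the same code should run in both environments.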

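Note that the two trailing characters in your string (percent-encoded as %06%88) are non-printable control codes. The byte 0x88 on its own is also not a valid UTF-8 sequence, which is why decodeURIComponent throws "URIError: URI malformed" on the full string and why that part is commented out in the snippet above. The two codes are: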

- `␆` (U+0006,  *Acknowledge*)
- `` (U+0088,  *Character Tabulation Set*)
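
If the input may contain escapes like `%88` that `decodeURIComponent` rejects, one option is to do the percent-decoding by hand, so that nothing throws. The following is a minimal sketch, not a definitive implementation: `decodeMixed` and `CP1252_HIGH` are names made up for illustration, and it assumes the whole input is UTF-8 that was misread as Windows-1252 (text that is already correct, such as a real `ñ`, would be damaged by it).

```javascript
// Characters that Windows-1252 assigns to bytes 0x80..0x9F, in byte order.
// Bytes undefined in cp1252 are kept as the matching C1 control character.
const CP1252_HIGH =
  '\u20AC\u0081\u201A\u0192\u201E\u2026\u2020\u2021' +
  '\u02C6\u2030\u0160\u2039\u0152\u008D\u017D\u008F' +
  '\u0090\u2018\u2019\u201C\u201D\u2022\u2013\u2014' +
  '\u02DC\u2122\u0161\u203A\u0153\u009D\u017E\u0178';

function decodeMixed(input) {
  const chars = Array.from(input); // split by code point
  const bytes = [];
  for (let i = 0; i < chars.length; i++) {
    const ch = chars[i];
    const hex = chars.slice(i + 1, i + 3).join('');
    if (ch === '%' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16)); // percent escape becomes one raw byte
      i += 2;
      continue;
    }
    const code = ch.codePointAt(0);
    const high = CP1252_HIGH.indexOf(ch);
    if (code < 0x100) {
      bytes.push(code);              // Latin-1 range: code equals byte
    } else if (high !== -1) {
      bytes.push(0x80 + high);       // cp1252-specific character
    } else {
      // Not a single cp1252 byte: keep the character as its UTF-8 bytes.
      bytes.push(...new TextEncoder().encode(ch));
    }
  }
  // A non-fatal decoder replaces invalid bytes with U+FFFD instead of throwing.
  return new TextDecoder('utf-8').decode(Uint8Array.from(bytes));
}

const sample = 'SeÃ±or and salvaciÃ³n and Number%3A 1234%3B %06%88';
console.log(decodeMixed(sample));
```

With this approach the `%06` stays as the control character U+0006 and the invalid `%88` becomes the replacement character U+FFFD. Whether to strip such characters, or map them to something else, is a separate decision that depends on what the binary file format intends them to mean.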